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Summary 


Most  imagery  and  motion  imagery  data  are  compressed  in  a  lossy  fashion,  implying 
that  the  available  data  are  not  exact  replicas  of  the  data  observed  by  the  sensor  prior 
to  compression.  While  such  compression  errors  are  not  typically  noticed  by  the  viewer, 
there  are  numerous  image  and  video  processing  algorithms  that  are  very  sensitive  to  noise. 
Compression  errors  have  very  noise- like  characteristics,  and  if  one  does  not  properly  account 
for  them  the  algorithms  may  fail  to  give  satisfactory  results.  Algorithms  that  are  highly 
sensitive  to  noise  typically  require  extremely  accurate  models  of  how  the  data  were  acquired, 
and  lossy  compression  forms  an  integral  component  of  the  acquisition  process  whose  influence 
must  not  be  ignored. 

The  current  document  examines  this  compression  noise  for  the  situation  when  the  com¬ 
pression  makes  use  of  scalar  quantization  of  the  data’s  wavelet  coefficients.  Application 
scenarios  include,  but  are  not  limited  to,  the  following:  imagery  compressed  according  to 
the  international  JPEG  2000  standard;  imagery  compressed  according  to  the  SPIHT  or  EZW 
compression  techniques;  motion  imagery  compressed  according  to  the  international  Motion 
JPEG  2000  standard;  and  motion  imagery  compressed  according  to  the  three-dimensional 
SPIHT  technique. 

Compression  errors  have  unique  behavior.  This  technical  report  provides  a  statistical  char¬ 
acterization  of  the  compression  noise  which  can  be  used  in  general  scenarios  that  are  sensitive 
to  noise  in  the  data.  Two  example  applications  are  presented:  image  deblurring  for  the  case 
of  a  single  image,  and  temporal  filtering  for  the  case  of  motion  imagery.  Results  suggest  that 
improvements  for  these  applications  are  obtainable  by  using  the  statistical  characterizations 
relative  to  simpler  models;  the  cost  of  such  improvements  is  the  additional  complexity  of  the 
more  accurate  model.  Other  situations  that  may  benefit  from  this  work  include  the  following: 
super-resolution  image  reconstruction;  deconvolution;  image  segmentation  and  classification; 
and  statistical  pattern  and  target  recognition. 
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1.  Introduction 


Using  wavelet  transforms  for  compression  of  images  is  becoming  increasingly  popular.  A 
two-dimensional  discrete  wavelet  transform  (DWT)  is  often  applied  for  compression  of  still 
images,  as  well  as  for  individual  frames  of  a  motion  imagery  sequence.  Three-dimensional 
wavelet  transforms  are  also  applied  to  imagery  volumes  such  as  motion  imagery  sequences 
and  hyperspectral  imagery  data.  In  either  case,  when  such  compression  is  lossy  there  is  an 
error  in  the  reconstructed  image  due  to  quantization  of  the  wavelet  coefficients.  In  many  im¬ 
age  processing  applications,  it  may  be  beneficial  to  be  able  to  characterize  the  reconstruction 
error — for  example,  numerous  algorithms  in  image  processing  use  probabilistic  models  for 
the  observation  noise,  and  more  accurate  probabilistic  models  lead  to  more  accurate  results 
from  the  overall  algorithm.  Characterization  of  compression  error  is  especially  important 
when  such  error  is  the  dominant  source  of  noise  in  a  system. 

Compression  noise  has  been  studied  thoroughly  for  the  case  of  the  discrete  cosine  transform 
(DCT)  [1],  which  found  and  continues  to  find  extensive  use  in  compression  of  images  and 
video.  However,  wavelet  compression  noise  has  not  been  studied  in  as  much  detail.  Woods 
and  Naveen  [2]  considered  the  compression  distortion  for  purposes  of  bit  allocation,  but 
consider  the  error  from  a  collective  point  of  view  (total  error  energy  being  the  sum  of  the 
total  error  energy  from  each  subband)  rather  than  the  local  point  of  view  (down  to  the  pixel 
level)  taken  here.  To  see  the  limitations  of  such  a  global  point  of  view,  consider  the  image  of 
Figure  1,  which  shows  the  mean-squared  error  averaged  over  1320  images  of  size  256  x  256 
that  were  compressed  according  to  the  wavelet-based  JPEG  2000  standard  [3] .  Bright  pixels 
in  the  figure  correspond  to  higher  mean-square  errors  at  those  pixel  positions.  An  obvious 
conclusion  from  the  image  is  that  the  variances  of  the  errors  are  changing  according  to  pixel 
location,  i.e.,  the  pixel  errors  are  not  identically  distributed. 

Not  only  are  the  compression  errors  reported  in  Figure  1  spatially  varying,  but  they  are 
correlated  as  well.  Table  1  shows  the  correlation  of  the  error  at  pixel  (66, 66)  with  the 
errors  of  surrounding  pixels,  which  clearly  demonstrates  that  there  are  significant  corre¬ 
lations  in  the  quantization  error  that  should  not  be  ignored  by  algorithms  that  process 
this  compressed  imagery.  Although  in  practice  it  is  tempting  to  assume  independent  and 
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Fig.  1.  Mean-squared  error  for  1320  images  of  size  256  x  256  compressed  according  to  JPEG-2000  at  10:1  compression 
ratio  with  128  x  128  tiles,  with  five  levels  of  wavelet  decomposition.  The  image  has  been  slightly  brightened  to 
emphasize  details.  Note  that  the  block  box  surrounding  the  image  is  part  of  the  axis,  and  not  part  of  the  error  image. 


identically  distributed  (HD)  noise  due  to  its  simplicity,  the  examples  just  shown,  in  addition 
to  the  derivations  and  experimental  evidence  to  follow  in  later  sections,  demonstrate  that  the 
quantization  error  is  not  independent  nor  identically  distributed,  and  proper  consideration 
of  the  noise’s  statistical  characteristics  can  be  beneficial. 

Error  patterns  such  as  that  shown  in  Figure  1  are  readily  evident  when  applying  a  three- 
dimensional  wavelet  transform  to  image  volumes.  However,  in  this  case  there  are  peaks 
and  valleys  in  the  mean-squared  error  in  the  two  spatial  dimensions  as  well  as  the  third 
dimension.  Examples  for  the  three-dimensional  case  are  given  in  later  sections. 

The  tile  boundary  effect  that  is  apparent  in  Figure  1  is  well  known,  and  there  are  various 
methods  of  counteracting  such  a  problem  [4]  [5].  Nevertheless,  even  if  one  carefully  compresses 
an  image  so  as  not  to  produce  excessive  boundary  effects,  the  error  throughout  a  compressed 
tile  will  still  exhibit  spatial-domain  errors  that  vary  in  a  manner  similar  to  that  of  Figure  1. 

In  general,  the  quantization  error  in  the  pixel  domain  will  be  the  sum  of  the  product  of 
the  wavelet  coefficients’  quantization  errors  times  the  corresponding  basis  image  for  each 
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TABLE  1 


Error  correlations  for  a  single  pixel  location  from  the  experiment  in  Eigure  1.  The  peak  center 

VALUE  CORRESPONDS  TO  THE  MEAN-SQUARE  ERROR  FROM  THE  (66,66)  LOCATION  IN  EiGURE  1,  WHILE,  FOR 
EXAMPLE,  THE  ENTRY  BELOW  THE  PEAK  CORRESPONDS  TO  THE  CORRELATION  BETWEEN  ERRORS  AT  LOCATIONS 

(66,66)  AND  (67,66). 


1.23  -0.77  0.10 
1.57  -0.72  0.40 
1.94  -1.53  0.98 

3.82  -0.84  -1.08 
2.59  0.04  -2.59 
2.12  -0.82  -1.04 
1.31  -1.85  -1.29 
1.08  1.60  1.87 

1.37  0.09  1.76 


-2.71  -2.54  -1.54 
-1.49  0.33  -0.15 

0.35  1.18  0.73 

2.57  10.59  1.97 

5.84  22.49  6.63 
3.38  10.73  4.42 
0.79  0.89  1.44 

-1.09  -2.79  1.15 

-0.67  -3.55  0.78 


-0.16  0.66  1.28 
0.24  0.21  0.81 

0.11  -0.42  -0.47 
-1.59  -0.33  0.22 

-1.58  -2.44  -1.14 
-1.53  -2.90  -2.55 
-0.67  -0.40  -1.24 
0.38  0.95  1.51 

1.33  1.37  0.56 


coefficient.  Such  a  statement  is  obvious  and  has  been  pointed  out,  for  example,  by  Watson 
et  al.  [6].  However,  the  authors  of  [6]  were  primarily  concerned  with  the  perceptibility  of 
wavelet  quantization  noise,  whereas  the  focus  of  this  paper  is  a  probabilistic  examination  of 
the  quantization  noise.  Such  a  probabilistic  model  of  the  quantization  noise  could  then  be 
used,  for  example,  in  imagery  or  video  restoration. 

The  next  section  provides  an  analysis  of  the  spatial-domain  effect  of  wavelet  quantization 
noise,  the  main  result  of  which  is  a  covariance  matrix  for  the  error  that  can  be  used 
in  conjunction  with  various  probability  distribution  functions  (pdf).  Section  3  discusses 
implications  of  Section  2,  including  example  predictions  of  quantization  noise  behavior  based 
on  the  noise  model,  as  well  as  further  analysis  of  the  observed  pixel  errors  reported  in 
Figure  1.  Section  4  provides  an  example  restoration  formulation  for  image  deblurring  that 
makes  use  of  the  quantization  noise  model.  A  further  example  of  temporal  filtering  for  the 
case  of  motion  imagery  is  presented  in  Section  5.  Results  in  both  example  cases  demonstrate 
that  using  the  proposed  quantization  error  model  allows  improvements  relative  to  the  simple 
but  common  independent  and  identically  distributed  noise  model.  Finally,  conclusions  are 
presented  in  Section  6. 
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2.  Quantization  Noise 

Without  loss  of  generality,  we  represent  the  pixel-domain  image  data  by  the  length-N 
vector  z,  which  is  formed  by  stacking  the  columns  from  either  a  two-dimensional  image  or  the 
images  of  a  three-dimensional  imagery  volume.  For  notational  convenience  it  will  be  assumed 
that  z  represents  a  single  W  x  W  image;  generalizations  to  non-square  images  and  motion 
imagery  are  straightforward.  The  multiresolution  DWT  being  employed  is  represented  by  the 
matrix  H,  of  size  N  xN,  which  can  be  either  the  two-  or  three-dimensional  DWT,  depending 
on  the  situation.  The  wavelet  coefficients  are  given  as  y  =  Hz,  which  are  quantized  to  = 
Hz -hey,  where  Gy  represents  the  error  due  to  quantization  of  the  wavelet  coefficients.  Upon 
application  of  the  inverse  wavelet  transform,  the  reconstructed  image  becomes  Zg  =  z  -h  e^, 
where  the  spatial  domain  quantization  error  is  =  H“^ey,  where  H“^  is  the  inverse  DWT. 

An  alternative  but  equivalent  way  of  looking  at  an  image  is  in  terms  of  the  basis  images 
of  the  inverse  DWT.  Then, 

^  =  Y.K!vY[u,v],  (1) 

where  Y[u,  v]  is  the  (u,  element  of  the  DWT  decomposition  represented  by  y,  and  \i~\  is 
the  basis  image  corresponding  to  the  (u,  wavelet  coefficient,  equivalent  to  the  {vW -hu)*^ 
column  of  H“^.  In  a  similar  manner,  the  quantization  error  can  be  written  as 

^^^Y.KIEy[u,v\.  (2) 

u,v 

Watson  et  ah  [6]  base  much  of  their  quantization  noise  perceptibility  study  on  the  error 
terms  [u,  n]. 

While  the  quantization  error  is  a  deterministic  function  of  an  input  image,  in  many  prac¬ 
tical  applications  once  the  quantized  signal  y^  is  calculated,  the  clean  signal  y  is  discarded, 
and  thus  explicit  information  about  the  quantization  error  is  lost.  A  commonly  used  theoretic 
tool  for  modeling  the  error  signal  is  to  treat  it  as  a  random  quantity  [7].  Treating  the  error 
as  random  provides  an  understanding  of  how  the  error  behaves,  including  how  much  the 
error  will  vary  at  different  pixel  locations  and  the  correlations  between  the  errors.  Such 
understanding  further  provides  the  theoretical  framework  for  formulating  effective  schemes 
for  alleviating  the  error.  This  statistically-based  model  of  the  error  signal  is  referred  to  as 
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a  quantization  noise  model  since  the  error  signal  represents  unwanted  information  in  the 
resulting  image  representation.  In  the  context  of  DCT-based  image  compression,  there  are 
numerous  cases  that  treat  the  compression  error  as  a  random  quantity;  some  do  so  in  order 
to  analyze  visibility  [8]  or  characteristics  [9]  of  the  error,  while  others  do  so  in  order  to 
formulate  algorithms  that  attempt  to  remove  the  noise  [10]  [11]  [12],  This  work  is  interested 
in  both  the  characterization  and  the  alleviation  of  the  compression  noise. 

The  original  image  data  z  has  a  covariance  matrix  of  Kz,  which  results  in  a  covariance 
matrix  for  the  wavelet  coefficients  of  Ky  =  HK^H*.  For  images  well- modeled  by  a  separable 
extension  of  first-order  stationary  Markov  sequences  with  positive  correlation  parameter  p  (a 
simple  but  common  image  model  [13]),  many  wavelet  transforms  will  approximately  provide 
uncorrelated  transform  coefficients.  The  coefficients  are  not  strictly  uncorrelated,  as  is  readily 
evident  by  looking  at  the  lowest  frequency  band  of  a  wavelet  decomposition;  the  decorrelating 
effect  also  depends  on  the  number  of  levels  of  the  wavelet  decomposition.  Here,  it  is  assumed 
that  the  coefficients  are  approximately  uncorrelated,  allowing  the  simplifying  approximation 
that  Ky  is  diagonal,  which  leads  to  the  approximation  that  the  covariance  matrix  of  ey,  Kgy, 
is  approximately  diagonal,  and  consists  of  the  wavelet  domain  quantization  error  variances 
for  each  coefficient.  Given  Key,  the  covariance  of  the  quantization  error  in  the  spatial  domain 
can  then  be  found  as 


K 


Gz 


H-^Ke  H 


-t 


H 


-1 


H-^K, 


(3) 


showing  that  the  error  covariance  in  the  pixel  domain  depends  only  on  the  error  variances 
in  the  wavelet  domain  and  the  basis  images  of  the  inverse  wavelet  transform.  Individual 
elements  of  Ke,^  can  be  easily  expressed  in  terms  of  the  basis  images  h“[,, 

(4) 

u,v 

where  (7y[u,u]  are  the  DWT-domain  error  variances  that  compose  Kgy,  and  h~l[m]  repre¬ 
sents  the  element  of  A  special  case  of  (4)  considers  the  diagonal  elements  of  Ke,^, 
which  represent  the  variances  of  the  pixel-domain  quantization  errors.  Quantization  error 
variances  are  themselves  of  significant  interest;  the  results  of  Figure  1  represent  estimates 
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of  these  terms  for  one  set  of  data.  In  general, 


diag(KeJ  =  ^  crl[u,  v]  (5) 

U,V 

where  the  square  of  the  vector  indicates  that  each  element  of  the  vector  is  squared,  and 

r  1  2 

diag(Ke,,)  is  the  vector  taken  from  the  diagonal  of  Kg,,.  The  terms  above  can  be 

considered  “error  variance  basis  images,”  since  the  error  variances  of  an  image  can  be  written 
in  terms  of  the  basis  summation  of  (5).  Examples  of  such  variance  basis  images  are  given 
in  the  next  section. 

It  is  argued  here  that  a  Gaussian  probability  distribution  function  provides  a  good  de¬ 
scription  of  the  quantization  error  in  the  pixel  domain.  The  primary  justihcation  is  due  to 
the  basis  image  summation  that  forms  a  reconstructed  pixel  error — the  quantization  error 
for  a  single  pixel  will  consist  of  the  sum  of  quantization  errors  for  each  basis  image  that 
overlaps  with  the  pixel,  as  described  by  (2).  The  number  of  such  basis  images  depends  on 
several  factors,  including  the  length  of  the  wavelet  reconstruction  hlters,  the  levels  and  type 
of  wavelet  decomposition,  and  the  location  of  the  pixel  within  the  fixed- length  transform 
size.  For  most  situations  of  interest,  the  number  of  contributing  terms  is  large  enough  such 
that  one  may  use  the  Central  Limit  Theorem  to  approximate  the  sum  of  random  variables 
represented  by  noisy  basis  images  as  Gaussian.  Thus,  the  probability  distribution  function 
of  the  pixel-domain  quantization  error  is  approximated  as 


P(^z)  = - ^ - 777  exp  <  --e*  K„  Bz  \ . 

(27r)^/2  2  ^  J 

The  exponent  in  the  above  equation  can  be  expanded  according  to  (3), 


--(HeJ'K-yHe, 


showing  that  the  exponent  of  the  probability  distribution  can  be  evaluated  by  simple  appli¬ 
cation  of  the  wavelet  transform  followed  by  scaling  according  to  the  diagonal 

Other  distributions  besides  Gaussian  are  possible,  although  the  Gaussian  distribution  is 
certainly  the  most  convenient  both  from  a  computational  standpoint  and  from  the  standpoint 
of  including  arbitrary  covariance  matrices  as  is  necessary  in  the  problem  at  hand.  One 
alternative  to  a  Gaussian  distribution  is  a  multivariate  Laplace  distribution,  which  more 
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closely  models  heavy  tails  of  a  noise  source.  Various  definitions  of  a  multivariate  Laplace 
distribution  exist;  Kotz  et  al.  [14]  define  it  in  terms  of  its  characteristic  function 


= 


1  + 


u 


(8) 


2'- 

Unfortunately,  using  (8)  in  a  general  image  or  video  restoration  formulation  can  be  prob¬ 
lematic  due  to  the  probability  density  function  corresponding  to  the  above  characteristic 
function — the  explicit  density  involves  Bessel  functions,  which  can  prove  difficult  from  a 
computational  point  of  view  in  typical  maximum  likelihood  (ML)  or  maximum  a  posteriori 
(MAP)  restoration  formulations. 

From  a  computational  standpoint,  a  more  reasonable  version  of  the  multivariate  Laplace 
distribution  is  given  by  the  following,  taken  from  [15]: 


p(ez)  =  exp  {-[e*C  ,  (9) 

where  A'jv,?)  is  a  constant  depending  on  ry  and  N  [15].  Equation  (9)  is  a  general  distribution 
parameterized  by  rj]  choosing  rj  =  2  yields  the  common  multivariate  Gaussian  distribution, 
while  choosing  ry  =  1  yields  a  multivariate  Laplace  distribution.  For  the  situation  at  hand 
we  are  interested  in  ry  =  1,  for  which  case  the  covariance  of  in  (9)  is  {N  +  1)C.  Choosing 
C  =  K.e^/{N  +  1)  thus  yields  the  desired  covariance  for  the  error  in  (9). 

Although  not  recommended  here,  it  is  possible  to  use  independently-distributed  Gaussian 
or  Laplacian  noise  sources  to  model  the  quantization  noise  in  the  spatial  domain.  The  obvious 
disadvantage  of  such  a  method  is  that  the  correlation  structure  contained  in  Ke^  is  lost. 
However,  by  assigning  the  variances  of  these  N  independent  noise  terms  according  to  the 
elements  of  the  diagonal  elements  of  as  in  (5),  one  can  still  capture  the  variance  behavior 
that  was  illustrated  in  Figure  1.  The  main  advantage  of  such  an  approach  is  a  potential 
reduction  in  computational  complexity. 

Given  a  distribution  type,  the  individual  wavelet  domain  quantization  error  variances 
that  compose  Kgy  determine  the  error  in  the  pixel  domain.  For  high-rate  situations,  the 
quantization  step  sizes  used  for  compression  are  small  enough  that  a  uniformly  distributed 
random  variable  accurately  models  the  quantization  error  in  the  wavelet  domain;  Watson 
et  al.  [6]  also  use  this  model.  In  such  cases,  the  diagonal  components  of  Kgy  are  composed 


of  terms  where  A,  is  the  quantization  step  size  for  the  wavelet  coeihcient.  For 

lower-rate  situations,  one  must  resort  to  more  complicated  means:  A  distribution  function 
must  be  assumed  for  each  wavelet  coefficient,  whose  parameters  must  be  estimated  from 
the  received  (noisy)  data;  the  quantization  noise  can  then  be  calculated  for  each  wavelet 
coefficient.  See  [1]  for  such  an  analysis  in  the  case  of  DCT  quantization  noise,  where  a 
Laplacian  distribution  is  assumed  for  each  DCT  coefficient.  Although  a  lower-rate  analysis 
for  the  case  of  the  wavelet  transform  could  be  performed  analogously  to  that  of  the  DCT, 
such  situations  were  not  the  focus  of  this  work,  and  are  not  considered  further.  In  general, 
however,  the  original  image  statistics  dictate  the  signal  energy  of  the  wavelet  coefficients, 
and  hence  the  energy  of  the  quantization  errors  in  the  wavelet  domain;  such  analysis  is  an 
interesting  area  for  further  investigation. 
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3.  Discussion 


Figure  2  shows  the  predicted  PSNR  for  each  of  16  video  frames  that  are  compressed 
by  uniformly  quantizing  the  coefficients  from  a  temporal  three-level  wavelet  decomposition 
with  the  9/7  biorthogonal  wavelet  filters  (note  that  this  example  only  considers  the  overall 
frame-by-frame  error,  not  the  pixel- by-pixel  error).  The  plot  correlates  very  well  with  PSNR 
results  reported  for  the  3D  SPIHT  video  compression  algorithm  [16],  which  compresses 
imagery  volumes  by  quantizing  their  3D  DWT  coefficients.  Such  an  analysis  of  frame-by- 
frame  quantization  error  was  previously  performed  by  Xu  et  ah  [17],  who  propose  a  wavelet 
transform  that  eliminates  the  low  PSNR’s  at  the  group-of-frames  boundaries  that  are  evident 
in  Figure  2.  Nevertheless,  regardless  of  whether  or  not  a  wavelet  transform  exhibits  boundary 
effects,  there  will  still  be  a  predicted  error  pattern  that  depends  on  the  reconstruction 
transform’s  basis  vectors.  It  is  the  variability  of  error  from  frame  to  frame  that  can  be 
quite  important  for  an  algorithm  that  is  processing  the  data. 

Figure  3  shows  error  histograms  from  the  experiment  reported  in  Figure  1.  Two  histograms 
are  shown — one  for  the  (0,  0)  location  of  every  4x4  block  in  the  images,  and  the  other  for 
the  (1, 1)  location  of  every  4x4  block.  If  one  treats  these  histograms  as  probability  mass 
functions,  the  variance  for  the  (1, 1)  location  is  approximately  twice  that  of  the  (0,  0)  location, 


Fig.  2.  Predicted  PSNR  for  16  frames  when  the  wavelet  coefficients  of  a  temporal  three- level  9/7  biorthogonal 
wavelet  decomposition  are  quantized  with  uniform  quantizers  at  high  rate. 
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Fig.  3.  Normalized  error  histograms  for  the  test  sequence  in  Figure  1.  The  solid  line  represents  errors  for  pixels  at 
locations  (x,y)  such  that  x  mod  4  =  0  and  y  mod  4  =  0;  the  connected-dot  line  represents  errors  for  pixel  locations 
(x,y)  where  x  mod  4=1  and  y  mod  4  =  1.  The  left-most  plots  show  the  error  histograms  at  two  different  horizontal 
magnifications,  and  the  right-most  plots  show  the  left-most  plots  on  a  logarithmic  vertical  axis. 


again  suggesting  that  not  all  pixel  errors  behave  in  the  same  manner.  From  the  logarithmic 
histogram  plots,  it  is  apparent  that  these  error  distributions  are  not  closely  following  a 
Gaussian;  if  they  were,  they  would  appear  as  inverted  parabolas.  One  might  argue  that  they 
more  closely  follow  a  Laplacian  or  Generalized  Gaussian  distribution,  although  the  convexity 
changes  that  are  apparent  in  the  log-scale  plots  suggests  otherwise.  Although  the  plots  of 
Figure  3  might  tempt  one  to  discard  the  Gaussian  quantization  noise  model  in  favor  of  a 
heavier-tailed  distribution,  such  an  action  is  sometimes  unnecessary — as  discussed  in  the 
next  sections,  for  several  cases  of  restoring  wavelet-compressed  images  the  Gaussian  noise 
model  outperforms  other  Laplacian  noise  models.  Note  also  that  these  tests  are  for  one 
particular  compression  ratio  and  one  particular  data  set,  and  the  shapes  of  the  distributions 
may  vary  at  different  compression  ratios. 

Figure  4  shows  predicted  quantization  error  variances  for  a  situation  similar  to  that 
shown  in  Figure  1,  where  the  prediction  is  taken  from  the  diagonal  components  of  Ke,,. 
Uniform  dead-zone  scalar  quantization  is  applied  to  each  wavelet  coefficient  of  subband  b 
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Fig.  4.  Predicted  error  variance  for  compression  of  four  tiles,  taken  from  diag(Kez)- 

with  quantization  step  sizes 

At  =  4^.  (10) 

\/76 

where  7;,  is  the  sum  of  the  squared  errors  introduced  by  a  unit  error  in  subband  6  at  a  location 
away  from  the  boundaries,  and  scales  the  nominal  quantizer  [2],  The  Daubechies  9/7 
DWT  [18]  is  the  transform  employed.  A  comparison  between  zoomed  portions  of  Figures  1 
and  4  is  presented  in  Figure  5;  a  remarkable  similarity  between  observed  and  predicted  error 
variances  is  apparent.  Note  that  the  predictions  shown  in  Fignres  4  and  5  are  based  on  the 
assnmption  that  all  wavelet  coefficients  are  non-zero,  and  thus  the  larger  quantization  bins 
for  zero-valued  coefficients  (due  to  the  dead  zone)  do  not  contribute. 

While  Figure  2  showed  only  the  temporal  variation  in  error  for  compression  using  a  three- 
dimensional  DWT,  there  is  also  significant  pixel  variation  as  was  present  in  Figure  4.  Figure  6 
shows  both  spatial  and  temporal  mean- square  error  prediction  for  a  64  x  64  sequence  of  16 
frames,  which  has  been  compressed  according  to  quantization  of  a  three-level  (both  spatial 
and  temporal)  3D  wavelet  decomposition.  The  sixteen  individual  error  frames  have  been 
displayed  in  a  4  x  4  tile,  with  the  first  image  at  the  top-left  and  the  last  image  at  the 


12 


Fig.  5.  Zoomed  portions  of  Figures  1  (left)  and  4  (right),  near  the  centers  of  the  images.  (The  ranges  of  both  images 
have  been  normalized  to  the  same  range  for  display  purposes.) 


bottom- right,  with  intermediate  images  occurring  in  row  order  between  these  two.  The 
overall  brightness  of  each  individual  tile  directly  corresponds  to  the  errors  depicted  in 
Figure  2.^  However,  Figure  6  clearly  demonstrates  that  there  is  significant  variation  in  the 
errors  according  to  both  temporal  and  spatial  pixel  location. 

The  variance  basis  images  determine  how  the  overall  pixel-domain  error  variances  will 
behave,  as  described  by  (5).  Figure  7  shows  four  such  basis  images,  taken  from  each  of  the 
four  subbands  of  the  lowest  level  of  a  five-level  wavelet  decomposition  for  128  x  128  images. 
Summations  of  variance  basis  images  such  as  these  lead  to  overall  variance  images  such  as 
the  one  shown  in  Figure  4.  Basis  images  at  higher- frequency  decomposition  levels  appear 
similar  to  those  shown  in  the  figure,  but  with  less  spatial  extent.  Since  the  example  in  the 
figure  is  for  128  x  128  images,  there  are  128^  different  basis  images  in  total;  however,  for 
each  subband  most  of  these  basis  images  are  simple  translates  of  each  other. 

Table  2  shows  two  predictions  of  the  covariance  of  the  error  for  the  (66,  66)  position,  as 
was  reported  for  experimental  observations  in  Table  1 .  The  predictions  have  been  normalized 
such  that  the  variance  value  is  100.0.  Again,  the  peak  center  value  corresponds  to  the 
variance  of  the  error  from  the  (66,66)  location.  For  part  (a)  of  the  table,  predictions  are 
computed  assuming  no  zero-valued  coefficients  were  received  by  the  decoder,  and  hence  no 


^Note  that  Figure  2  plotted  PSNR,  where  higher  values  indicate  higher  quality,  whereas  Figure  6  plots  mean-square 
error,  where  higher  values  indicate  poorer  quality. 
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Fig.  6.  Predicted  quantization  error  variance  for  a  length- 16  volume  of  64  x  64  images,  compressed  using  a  three- 
level  (both  temporal  and  spatial)  wavelet  decomposition.  The  64  x  64  error  variance  images  are  tiled  from  top-left  to 
bottom-right. 


larger  quantization  bins  due  to  the  quantizer  dead  zone.  In  part  (b)  of  the  table,  predictions 
are  computed  by  randomly  assigning  zero  and  non-zero  coelRcients,  and  hence  the  larger 
quantization  bins  for  dead-zone  coefficients  affect  the  result  (note  that  this  prediction  is  just 
one  realization  of  many  possible  zero/non-zero  coefficient  scenarios,  and  different  correlations 
occur  for  different  configurations  of  zero/non-zero  observations).  Predicted  correlations  for 
part  (a)  of  the  table  indicate  only  minor  correlations  between  the  errors  at  pixel  (66,  66) 


14 


Fig.  7.  Four  error  variance  basis  images,  taken  from  the  four  subbands  of  the  lowest  level  of  a  five-level  wavelet 
decomposition  for  128  x  128  images  (black  represents  0). 


and  its  neighbors,  and  one  might  argue  that  approximating  them  as  independent  would 
be  valid.  However,  as  part  (b)  of  the  table  clearly  demonstrates,  different  quantization  bin 
sizes  that  arise  as  a  result  of  dead-zone  quantization  can  introduce  significant  correlations 
between  errors,  and  for  this  particular  example  such  correlations  exist  beyond  the  eight 
nearest  neighbors. 
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TABLE  2 


Normalized  prediction  oe  error  correlations  eor  a  single  pixel  location  under  the  same  conditions 

REPORTED  IN  FIGURE  1.  (a)  PREDICTIONS  ARE  COMPUTED  ASSUMING  NO  ZERO- VALUED  QUANTIZED  COEEEICIENTS; 
(b)  PREDICTIONS  ARE  COMPUTED  BY  RANDOMLY  ASSIGNING  ZERO  AND  NON-ZERO  QUANTIZED  COEEEICIENTS. 


-0.27 

-0.15 

-0.62 

-3.03 

-6.36 

-2.81 

-0.21 

0.30 

0.18 

-0.15 

-0.38 

-1.09 

-2.51 

-5.68 

-2.15 

-0.43 

0.36 

0.58 

-0.62 

-1.09 

-0.95 

-0.63 

4.91 

-0.14 

-0.06 

-0.10 

0.35 

-3.03 

-2.51 

-0.63 

2.81 

10.16 

3.35 

0.34 

-1.42 

-1.96 

-6.36 

-5.68 

4.91 

10.16 

100.00 

10.72 

5.92 

-4.56 

-5.27 

-2.81 

-2.15 

-0.14 

3.35 

10.72 

3.90 

0.85 

-1.03 

-1.73 

-0.21 

-0.43 

-0.06 

0.34 

5.92 

0.85 

0.86 

0.60 

0.77 

0.30 

0.36 

-0.10 

-1.42 

-4.56 

-1.03 

0.60 

1.14 

1.06 

0.18 

0.58 

0.35 

-1.96 

-5.27 

-1.73 

0.77 

1.06 

0.65 

(a) 

0.48 

1.57 

1.66 

-3.94 

-9.14 

-3.56 

2.34 

2.40 

1.19 

0.52 

0.72 

0.26 

3.35 

-3.11 

1.46 

2.77 

2.86 

1.44 

-0.18 

-1.49 

-0.56 

10.74 

17.32 

10.49 

2.57 

1.83 

1.23 

-3.05 

3.11 

10.72 

-24.13 

-5.05 

-6.62 

3.86 

1.10 

0.42 

-5.85 

-2.97 

14.47 

-11.60 

100.00 

22.01 

2.05 

-5.23 

-2.11 

-2.76 

-1.76 

-1.72 

-9.66 

-13.02 

9.92 

CO 

-1.18 

-0.18 

-0.80 

-4.97 

-4.00 

8.07 

34.45 

7.24 

-1.77 

-3.35 

0.31 

0.24 

-3.10 

-4.68 

5.30 

11.77 

2.95 

-2.97 

-2.48 

0.20 

0.67 

-0.78 

-1.99 

1.00 

1.79 

0.56 

-2.78 

-1.69 

-0.09 

(b) 
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4.  Restoration  Example:  Deblurring 


The  wavelet  quantization  noise  model  introduced  in  previous  sections  has  been  incorpo¬ 
rated  in  several  restoration  algorithms,  including  image  deblurring  and  multi- frame  motion 
imagery  restoration.  Although  Section  2  introduced  several  different  types  of  probability 
distribution  functions  that  could  be  used,  experiments  have  shown  that  the  Gaussian  noise 
model  that  makes  use  of  the  full  Kg,,  covariance  matrix  gives  the  best  restoration  results. 
In  order  of  increasing  performance,  the  following  distribution  types  have  been  investigated 
with  regard  to  their  restoration  effectiveness:  IID  Laplacian  (distribution  is  a  product  of 
univariate  Laplacian  distributions);  independent  Laplacian  distribution  with  variances  taken 
from  diag(Ke,,);  IID  Gaussian;  independent  Gaussian  with  variances  taken  from  diag(Ke,,); 
the  multivariate  Laplacian  described  by  (9);  and  the  Gaussian  noise  model  with  covariance 
matrix  Kg,,.  The  latter  two  of  these  distributions  give  nearly  identical  restoration  results, 
which  can  be  explained  by  the  similarity  in  the  gradients  of  the  log-likelihoods  of  the 
two  distributions.  Section  5,  which  provides  more-extensive  simulations  with  different  noise 
models  within  a  temporal  filtering  framework,  corroborates  this  informal  ranking  of  the  noise 
models’  accuracy. 

The  remainder  of  this  section  is  devoted  to  the  particular  restoration  case  of  deblurring 
wavelet-compressed  images.  Although  different  types  of  blurring  are  possible,  this  section 
will  focus  on  linear  motion  blurs.  A  brief  problem  formulation  will  be  presented,  followed 
by  experimental  results  that  demonstrate  the  advantage  of  using  the  Gaussian  quantization 
noise  model  compared  to  using  an  IID  Gaussian  noise  model.  Since  the  results  corresponding 
to  the  Laplace  noise  models  are  inferior  to  those  of  the  Gaussian  noise  models,  the  formula¬ 
tions  for  the  Laplacian  case  are  omitted;  nevertheless,  such  formulations  are  straightforward 
and  follow  very  closely  the  developments  that  follow. 

4.1.  Problem  Formulation 

If  the  original  image  is  denoted  as  z,  the  observation  model  for  the  blurred  image  z;,  can 
be  written  as 

Zb  =  Bz  +  na,  (11) 
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where  B  is  the  matrix  representing  the  motion  blur,  and  iia  represents  acquisition  noise  of 
the  imaging  system.  While  the  observation  vector  is  of  size  x  1,  the  original  image 
vector  z  is  of  size  M  x  1,  M  >  N ,  due  to  the  influence  of  image  content  beyond  the  border 
of  the  observed  image;  the  blurring  matrix  B  is  x  M.  We  do  not  include  any  nonlinearities 
in  the  observation  model  (due  to,  for  example,  the  response  characteristics  of  film  or  gamma 
processing),  although  alternative  formulations  that  include  them  are  possible.  Note  that 
there  are  a  number  of  methods  that  are  commonly  used  for  handling  the  boundary  conditions 
in  the  deblurring  problem,  including  zero-padding  (or  padding  with  some  other  constant) 
outside  the  boundary,  periodic  extension  beyond  the  boundary,  and  symmetric  extension 
beyond  the  boundary;  however,  while  some  of  these  alternatives  may  offer  computational 
incentive  for  their  use,  none  model  the  actual  blurring  process  as  accurately  as  the  method 
described  above  which  has  the  number  of  pixels  to  be  estimated  larger  than  the  number  of 
observations,  i.e.,  M  >  N. 

As  described  in  previous  sections,  the  image  z^  can  be  transformed  by  a  DWT,  quantized, 
and  then  inverse  transformed  back  to  the  spatial  domain  to  form  the  observed  image  Zg, 

Zg  Z^  -\-  02 

=  Bz -|- Ha  -|- 02,  (12) 

where  the  quantization  noise  0z  is  equivalent  to  the  noise  term  discussed  in  previous  sec¬ 
tions.  For  many  situations,  the  quantization  noise  02  will  dominate  the  acquisition  noise  !!„ 
(compression  has  a  tendency  to  smooth  out  and  remove  much  of  this  noise),  and  we  will 
approximate  (12)  as 

Zg  =  Bz-F02.  (13) 

Although  not  implemented  for  the  deblurring  case,  the  methods  of  Section  5.4  could  be  used 
to  model  explicitly  the  combined  effect  of  iia  and  02. 

It  is  well  known  that  care  must  be  taken  when  estimating  z  from  its  blurred  and  noisy 
observation  Zg,  for  otherwise  excessive  noise  amplification  can  result  [19].  To  avoid  such 
problems,  the  restoration  here  makes  use  of  a  prior  image  model  that  discourages  noise 
amplification,  and  solves  the  problem  in  a  Bayesian  framework.  In  particular,  the  image 
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Fig.  8.  Masks  used  for  construction  of  d^z.  For  example,  for  the  first  of  the  four  cliques  above,  d^z  =  —Z[m^n  —  1] 
+2Z[m,n]  —  Z[m,n  +  1],  where  Z[m,n]  is  the  element  of  the  image  represented  by  z. 


estimate  is  taken  as  the  maximum  a  posteriori  solution  that  maximizes  the  posterior  dis¬ 
tribution  of  z  given  which  is  equivalent  to  maximizing  p(z)p(zg|z).  The  likelihood  term 
p(z^|z)  is  found  by  combining  (6)  and  (13), 


exp  {-|(Bz  -  Z5)*KJ(Bz  -  z,)} 


(27r)^/2  |K, 


,1/2 


(14) 


Cz  I 


A  Markov  Random  Field  (MRF)  [20]  is  used  to  describe  the  prior  image  probability  p(z), 


p(z)  =  y  exp 


(15) 


where  V  is  a  normalizing  constant,  A  is  related  to  the  temperature  parameter  associated 
with  the  Gibbs  prior  distribution  [20],  c  are  indexes  of  local  groups  of  pixels  called  cliques, 
and  C  is  the  overall  set  of  such  cliques.  The  d*z  terms  form  spatial  activity  measures  which 
effectively  discourage  the  type  of  noise  amplihcation  possible  when  deblurring  images.  Here, 
the  d*z  terms  at  each  pixel  location  are  chosen  to  approximate  second-order  directional 
derivatives,  with  coefficients  as  shown  in  Figure  8.  The  function  Pt(')  controls  how  heavily 
the  spatial  activity  measures  are  penalized,  and  is  chosen  here  as 


Pr(b  —  s 


£2  |£|  <  T 

+  2T{\e\-T)  \e\>T 


(16) 


shown  graphically  in  Figure  9  for  T  =  4.  Such  a  choice  of  penalizing  function  results  in  a 
Huber-Markov  Random  Field  (HMRF),  which  has  been  used  with  much  success  in  other 
image  and  video  restoration  problems  [21]  [22].  The  Pt{-)  function  quadratically  penalizes 
values  less  than  or  equal  to  T,  while  linearly  penalizing  values  greater  than  T,  which  prevents 
sharp  image  discontinuities  like  edges  from  being  excessively  blurred  as  can  happen  when 
using  a  purely-quadratic  penalty  function.  For  further  detail  on  the  use  of  the  HMRF  in 
image  restoration,  the  reader  is  referred  to  [21]  [22]  [23]. 
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Fig.  9.  The  Huber  penalty  function  for  T  =  4  (solid)  compared  with  a  quadratic  penalty  function  (dotted). 


Since  maximization  of  the  posterior  distribution  is  equivalent  to  minimization  of  the 
negative  of  its  natural  logarithm,  the  image  estimate  can  be  written  as  the  solution  to 

z  =  argmm  ^(Bz  -  z,^)*K-hBz  -  z,^)  +  A  (d^z)  .  (17) 

The  solution  to  this  convex  minimization  problem  is  determined  iteratively  based  on  a 
gradient  descent  algorithm,  which  starts  with  an  initial  estimate  Z(o)  and  updates  the 
estimate  as 

zp+i)  =  Z(p  -  tt(i)g(p,  (18) 

where  gp)  is  the  gradient  of  the  objective  function  in  (17)  evaluated  at  zp),  and  ttp)  is  a  step 
size  which  controls  how  far  the  estimate  for  iteration  (i-\- 1)  moves  from  zp)  along  gp).  The 
initial  estimate  Z(o)  is  taken  as  z^  for  the  N  observed  pixels;  the  M  —  N  off-screen  pixels  of 
Z(o)  are  computed  by  blurring  z^  in  the  direction  opposite  of  the  blur  B. 

The  gradient  is  computed  from  (17)  as 

gp)  =  B*K;^^(Bzp)  -  z<j)  +  A  ^  dcp^dczp)) 

ceC 

=  B‘H'K,-‘H(Bz,.|-z,)  +  A^d,pqd;z|.)).  (19) 

ceC 
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For  real  unitary  discrete  transforms  (for  example,  the  DOT),  H*  =  and  the  term  H* 
above  is  easily  implemented  as  the  inverse  transform.  Since  all  DWT  matrices  of  practical 
interest  for  image  compression  are  not  unitary,  one  must  implement  a  new  transform  oper¬ 
ation  that  is  equivalent  to  Hh  Appendix  A  discusses  the  implementation  of  the  transpose 
of  the  DWT  matrix. 

The  step  size  to  be  used  in  conjunction  with  the  gradient  is  given  as 

_  S(i)S(i) 

OL{i)  —  F 


4) 


B*H*K 


HB  +  A5:d.p”(d'z(,|)d' 

ceC 


S(i) 


whose  derivation  is  given  in  Appendix  B.  Iterations  of  (18)  are  continued  until  the  conver¬ 
gence  criterion 

-OCzm'l 

<  £  (21) 


0\ 

(zp+i)^ 

)-0\ 

(z(b) 

0{ 

1 

is  met,  where  O(-)  represents  the  objective  function  in  (17)  and  e  is  a  small  threshold  that 
sets  the  rate  of  decrease  in  the  objective  function  that  is  required  for  convergence.  If  the 
number  of  iterations  exceeds  some  pre-determined  threshold  (taken  as  1500  here)  without 
convergence  according  to  (21),  the  algorithm  is  also  terminated. 


4..S.  Experimental  Results 

To  test  the  deblurring  algorithm,  the  512  x  512  grayscale  image  boat  image  is  first  blurred 
with  a  filter  B  that  simulates  the  effect  of  the  spatial  and  temporal  integration  of  a  CCD 
image  array  when  the  relative  motion  between  the  scene  and  camera  over  the  integration 
time  of  the  sensor  has  a  distance  of  5.7  pixels  at  angle  — tt/O,  where  the  velocity  is  constant. 
The  central  384  x  384  pixels  of  the  result  are  then  taken  as  the  blurred  image.  (Using  the 
central  region  like  this  better  simulates  real  data,  because  these  384  x  384  pixels  contain 
information  from  other  pixels  outside  of  the  sub-image,  as  would  be  expected  from  a  true 
camera.)  The  coefficients  of  a  four-level  384  x  384  Daubechies  9/7  wavelet  decomposition 
are  quantized  as  described  in  Section  3,  and  the  final  observed  image  is  taken  as  the  inverse 
wavelet  transform  of  the  dequantized  coefficients;  reconstruction  levels  are  taken  as  the 
midpoints  of  the  quantization  cells.  The  PSNR  of  the  de-compressed  blurred  image  relative 
to  the  uncompressed  blurred  image  is  42.8  dB. 
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Iteration 


Fig.  10.  Plots  of  PSNR  vs.  iterations  for  deblurring.  Examples  are  given  for  maximum  likelihood  restorations 
(A  =  0.0),  for  both  IID  and  the  proposed  noise  models;  and  for  MAP  restorations  (A  >  0.0),  for  both  IID  and  the 
proposed  noise  model. 


Two  methods  of  deblurring  are  presented.  The  first  method  uses  the  wavelet  quantization 
noise  model  as  described  in  Section  4.1,  while  the  second  method  replaces  the  noise  covari¬ 
ance  term  in  (17)  with  a  diagonal  matrix  consisting  of  elements  12/A^,  where  was 
introduced  in  Section  3.  The  latter  of  these  two  methods  implies  independent  and  identically 
distributed  Gaussian  noise,  which  for  the  case  at  hand  leads  to  a  deblurring  estimate  that 
makes  use  of  (18)  and  modified  versions  of  (19)  and  (20),  omitted  here  for  brevity.  For  the 
IID  model,  variances  of  Ayi2  provide  rough  estimates  of  the  spatial-domain  quantization 
noise.  For  all  cases  reported  here,  the  Huber  parameter  was  chosen  as  T  =  4.0. 

The  quality  metric  used  here  is  PSNR,  defined  as  101ogio(255^/MS'£'),  where  MSE  is 
the  mean-squared  error  between  a  reconstruction  and  the  reference  image.  Figure  10  plots 
PSNR  as  a  function  of  iteration  for  several  values  of  A,  including  the  case  of  A  =  0.0 
which  removes  the  regularizing  smoothness  term  from  the  formulation  to  form  a  maximum 
likelihood  estimate.  From  the  figure,  it  is  readily  evident  that  there  are  two  PSNR  values 
which  could  be  used  for  comparing  the  two  methods — the  converged  value,  PSNRc,  which  is 
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Fig.  11.  Plots  of  A  vs.  PSNR  for  deblurring,  showing  both  the  PSNR  at  the  peak  of  the  plots  from  Figure  10  and 
at  the  converged  solutions. 


the  PSNR  achieved  after  1500  iterations  or  convergence;  and  the  maximum  value,  PSNR^, 
which  is  the  maximum  PSNR  achieved  and  may  or  may  not  be  equal  to  PSNR^.  In  an  actual 
deblurring  scenario,  it  is  impossible  to  know  exactly  when  the  maximum  PSNR  is  achieved, 
but  careful  monitoring  of  deblurring  progress  would  allow  a  user  to  terminate  an  algorithm 
when  the  reconstruction  appears  best.  Thus  PSNR^  is  a  valid  measure,  although  a  hands-off 
reconstruction  (where  the  estimate  is  taken  at  algorithm  convergence,  hence  using  PSNRc) 
would  be  preferred. 

Figure  11  plots  the  two  quality  measures  PSNRc  and  PSNR^  as  a  function  of  A.  The  best 
reconstruction  of  the  model  making  use  of  the  quantization  noise  model  is  about  0.26  dB 
better  than  the  best  reconstruction  using  the  IID  noise  model.  Results  for  other  test  images 
under  similar  conditions  are  shown  in  Table  3.  While  these  improvements  are  not  large,  they 
do  demonstrate  that  using  the  quantization  noise  model  consistently  provides  reconstructions 
that  are  mathematically  closer  to  the  original  than  when  using  the  IID  noise  model.  For 
all  simulations  performed,  including  those  not  reported  in  Table  3,  PSNR  values  of  the 
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TABLE  3 


PSNR  IMPROVEMENTS  FOR  RESTORATION  USING  QUANTIZATION  NOISE  MODEL  VERSUS  RESTORATION  ASSUMING  IID 

NOISE  MODEL. 


sequence 

Compressed 

PSNR 

Motion 

Length 

Motion 

Angle 

APSNRc 

APSNRc 

barb 

46.4 

7.5 

7r/7 

0.48 

0.36 

boat 

43.2 

9.0 

7r/4 

0.34 

0.34 

boat 

42.8 

5.7 

— 7r/9 

0.26 

0.26 

goldhill 

42.6 

10.0 

— 7r/6 

0.22 

0.21 

goldhill 

41.2 

5.0 

— 7r/3 

0.19 

0.19 

lena 

41.8 

11.1 

7r/5 

0.37 

0.37 

mandrill 

38.4 

14.2 

7r/5 

0.16 

0.14 

mandrill 

39.0 

4.1 

-47r/5 

0.22 

0.23 

peppers 

46.2 

12.1 

47r/5 

0.36 

0.36 

reconstructions  were  higher  when  using  the  quantization  noise  model  than  when  using  the 
IID  noise  model.  Since  the  only  difference  between  the  two  restoration  methods  is  in  their 
noise  characterization,  it  stands  to  reason  that  the  quantization  noise  model  of  Section  2 
provides  a  more  accurate  description  of  the  compression  error  than  does  an  IID  noise  model. 

Visual  results  of  the  reconstructions  that  use  the  quantization  noise  model  are  given  in 
Figure  12.  Part  (c)  of  the  figure  shows  the  reconstruction  for  A  =  0.0,  which  corresponds 
to  the  maximum  likelihood  solution;  part  (d)  shows  the  MAP  reconstruction.  For  both 
reconstructions  shown,  the  estimate  is  taken  at  the  iteration  that  maximizes  PSNR.  The 
ML  estimate  was  thus  taken  at  the  peak  of  the  ML  plot  shown  in  Figure  10,  while  the  MAP 
estimate  had  its  highest  PSNR  at  convergence.  Superiority  of  the  MAP  estimate  to  the  ML 
estimate  is  apparent  in  the  figure,  which  demonstrates  the  noise  suppression  introduced  by 
the  prior  model.  Distinguishing  between  the  reconstructions  of  the  IID  noise  model  and  the 
quantization  noise  model  is  difficult  on  the  printed  page,  and  thus  visual  results  for  the  IID 
noise  assumption  are  not  presented  here. 

As  a  side  note,  the  reconstructions  in  the  figure  have  additional  spatial  extent  than  the 
384  X  384  original  (due  to  the  spatial  blurring  at  the  image  boundaries  with  pixels  outside  of 
the  field  of  view),  which  may  or  may  not  be  important  to  a  user  of  a  deblurring  application;  in 
either  case,  using  additional  pixels  outside  of  the  boundary  helps  to  prevent  excessive  ringing 
near  the  borders,  as  might  be  expected  when  using  the  less-accurate  boundary  conditions 


24 


Fig.  12.  Visual  deblurring  results  when  using  the  quantization  noise  model.  The  original  image  is  shown  in  (a),  with 
the  blurred  and  compressed  image  shown  in  (b).  The  PSNR  of  the  blurred  and  compressed  image  in  (b)  relative  to 
the  blurred  but  uncompressed  image  is  43.2  dB,  while  the  PSNR  of  the  image  in  (b)  relative  to  the  image  in  (a)  is 
20.0  dB.  Results  are  all  taken  at  the  point  of  best  reconstructed  PSNR.  (c)  Maximum  likelihood  estimate,  A  =  0.0, 
PSNR^  =  27.34  dB;  (d)  MAP  estimate,  A  =  1.5  x  10“^  PSNR^  =  28.57  db. 


mentioned  previously.  Although  not  reported  in  Table  3,  it  is  interesting  to  note  that  using 
the  Huber  penalty  function  pr(')  gives  significant  improvement  relative  to  using  a  quadratic 
penalty  function^  corresponding  to  a  Gauss-Mar kov  Random  Field  (GMRF).  Discounting 
the  noise  model,  the  HMRF  gave  up  to  1.2  dB  improvement  relative  to  the  GMRF  for  the 


quadratic  penalty  function  corresponds  to  allowing  the  Huber  parameter  T  to  become  arbitrarily  large. 
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cases  presented  above. 
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5.  Restoration  Example:  Temporal  Filtering 


This  section  demonstrates  various  methods  of  restoring  motion  imagery  compressed  by 
scalar  quantization  of  both  2D  and  3D  wavelet  coefficients.  In  both  cases,  the  data  of  interest 
are  motion  imagery,  i.e.,  they  are  sequences  of  images  indexed  by  time.  Filtering  along 
the  time  domain  allows  the  incorporation  of  information  from  surrounding  frames  for  the 
restoration  of  the  current  frame.  Such  ideas  have  been  used  in  the  super-resolution  literature 
(for  example,  in  [22]),  where  the  goal  is  to  improve  the  spatial  resolution  by  using  information 
from  surrounding  frames;  here,  however,  the  goal  is  to  improve  the  quality  of  a  frame  without 
changing  resolution.  Several  examples  of  temporal  filtering  in  this  manner  can  be  found 
in  the  literature  [24]  [25]  [26]  [27].  The  section  begins  by  formulating  the  general  framework 
for  temporal  filtering,  and  continues  by  examining  temporal  filtering  when  using  several 
different  variations  of  the  quantization  noise  model  discussed  in  this  report.  The  first  two 
subsections  provide  results  for  synthetically-generated  video  whose  individual  frames  have 
been  compressed  by  using  the  2D  DWT.  Subsection  5.3  extends  these  results  to  the  case 
of  compression  with  a  3D  DWT.  Subsection  5.4  considers  the  case  of  actual  video,  showing 
that  benefits  of  using  the  proposed  quantization  noise  model  apply  for  real  video  as  well  the 
synthetic  sequences  used  in  the  first  three  subsections. 

Suppose  the  original  image  at  time  fc  is  z^.  Since  a  temporal  filter  uses  pixel  information 
from  frames  at  different  time  instants,  one  must  first  describe  the  relationships  between  the 
images  at  different  times.  One  common  method  of  relating  images  at  two  time  instants  k 
and  I  is  to  model  them  such  that 


=  Ak,iz'^  +  +  e'^’\  (22) 

where  the  matrix  Ak^i  forms  a  prediction  of  given  z^,  is  the  error  in  such  a  prediction, 
and  is  a  “mean”  term  that  accounts  for  pixels  in  z^  that  are  unobservable  from  z^.  Note 
that  when  I  =  k,  A^^k  =  I)  =  0,  and  =  0.  When  the  original  images  are  compressed, 
the  model  that  relates  the  observation  at  time  I  to  the  original  image  at  time  k  must  be 
modified  according  to  the  quantization  error, 

z'  +  +  +  (23) 
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where  the  additional  noise  term  is  the  same  quantization  error  term  introduced  in  Section  2. 
For  notational  convenience,  the  two  noise  terms  can  be  combined  into  a  single  noise  term, 


z'  =  +  n‘ 


(24) 


The  error  is  assumed  to  be  normally  distributed,  with  mean  0  and  covariance  K.k,h  which 
yields  the  probability  distribution  function  of  given  z^. 


pKIz'')  = 


1 


(2^)«/2|Ky|l/2 


exp 


1 


(A,.,z‘  +  -  4)}  .  (25) 


With  the  simplifying  approximation  that  z^|z^  and  z^jz''  are  independent  for  m  I,  the 


m  I  rrk 


joint  pdf  of  observations  of  compressed  images  at  times  fc  —  n, . . . ,  A:  +  n  is 

k-\-n 


(26) 


The  temporal  filters  described  in  this  section  all  find  maximum  likelihood  estimates  of  a 
single  frame  z^  given  degraded  observations  of  the  sequence  at  time  instants  within  in 
of  A:,  i.e.,  k  —  n, . . . ,  k  +  n.  The  maximum  likelihood  estimator  chooses  an  estimate  for  z^ 
that  maximizes  the  likelihood  term  in  (26).  Since  maximizing  a  function  is  equivalent  to 
minimizing  the  negative  of  its  natural  logarithm,  the  ML  estimate  can  be  written  as 


k-\-n 

it  =  argmin  ^  {Kk,iz^  +  -  z' )*K^)(Afc,;z'=  +  m"’'  -  z' ).  (27) 

TT  1  1  ’ 

L=k—n 

The  various  temporal  filtering  schemes  presented  in  this  section  differ  in  their  dimensionality 
(two-  or  three-dimensional  DWT)  as  well  as  their  choice  of  covariance  matrix  but  the 
general  problem  setup  is  that  of  (27).  Equation  (27)  is  solved  using  an  iterative  conjugate 
gradient  optimization  algorithm,  details  of  which  are  given  in  Appendix  C.  Note,  however, 
that  some  of  the  simplified  noise  terms  lead  to  simplifications  that  allow  almost  trivial 
implementations  that  are  not  iterative;  such  simplifications  will  be  discussed  as  they  are 
introduced  in  later  subsections. 

Elements  of  the  mean  term  are  chosen  as 


k,l  _ 


0  for  observable  pixels 


z^  for  unobservable  pixels 


(28) 


For  pixels  at  time  I  that  do  not  correspond  to  any  pixels  at  time  A;,  the  corresponding 
rows  of  Aj.  I  consist  entirely  of  zeros.  To  classify  a  pixel  at  time  I  as  having  or  not  having 
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corresponding  pixels  at  time  k,  a  simple  threshold  can  be  applied  to  the  difference  |Zg— 
after  motion  estimation;  knowledge  of  pixels  appearing  or  disappearing  at  image  edges  due 
to  camera  motion  can  also  be  used.  More  sophisticated  algorithms  could  be  used  to  estimate 
both  Ak^i  and  simultaneously,  although  they  are  not  explored  here. 

The  following  subsections  discuss  various  methods  and  experiments  of  temporally  hltering 
wavelet-compressed  motion  imagery. 

5.1.  Experiment  1 

Consider  the  case  where  each  frame  of  an  image  sequence  is  compressed  by  use  of  a 
2D  DWT,  independently  of  the  other  frames.  This  subsection  considers  the  simplified  case 
where  both  the  horizontal  and  vertical  motion  between  frames  is  an  integer  number  of  pixels. 
The  simplest  of  temporal  filters  assumes  Gaussian  noise  with  covariance  matrix  that 
is  diagonal  with  elements  for  each  I  =  k  —  n, . . .  ,k  +  n.  Such  a  filter  equally  weights  all 
observations  of  each  frame  at  each  time  instant,  and  can  be  loosely  interpreted  as  a  box 
filter  in  the  temporal  dimension  along  the  motion  trajectories  described  by  the  ^  matrices. 
Another  simple  filter  assumes  IID  Laplace  noise,  and  results  in  a  median  filter  along  the 
motion  trajectories.  More  accurate  versions  of  these  two  filters  can  be  achieved  by  using  the 
same  noise  type  (i.e.,  independent  Gaussian  or  Laplacian),  but  assigning  =  diag(Ke^). 
For  the  Gaussian  case,  this  leads  to  a  weighted-average  filter  along  motion  trajectories  such 
that  pixels  that  are  more  accurate  according  to  the  quantization  noise  model  are  weighted 
more  heavily  than  pixels  that  are  less  accurate.  Similarly,  the  Laplace  case  leads  to  a 
weighted-median  filter  along  motion  trajectories.  The  fifth  and  final  case  considered  here  is 
to  use  Gaussian  noise  with  which  cannot  be  implemented  by  simple  averaging  or 

median  filtering,  but  is  instead  implemented  with  the  conjugate  gradient  method  discussed 
in  Appendix  G.  All  five  of  these  noise  models  for  K.k,i  consider  only  the  noise  introduced  by 
compression. 

The  initial  experiment  given  in  this  subsection  is  designed  to  isolate  the  effect  of  the 
quantization  noise  model.  The  input  sequence  consists  of  five  frames,  where  each  frame  is 
taken  from  a  single  source  image  and  globally  translated  with  integer  shift;  thus  the  only 
difference  between  these  five  images  is  that  they  are  global  translates  of  each  other,  where 
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the  global  motion  parameters  are  known.  In  a  very  simple  sense,  these  five  images  form  an 
artificially-constructed  “image  sequence.”  Obviously,  sequences  with  such  a  property  will 
almost  never  occur  in  actual  video;  however,  the  point  of  this  snbsection  is  to  isolate  the 
effect  of  the  quantization  noise  model,  and  the  method  discussed  here  eliminates  the  influence 
of  other  motion-related  error  such  as  or  uncertainty  in  the  matrices  Subsection  5.4 
will  relax  the  constraints  imposed  here  so  as  to  filter  actual  video. 

The  integer  shifts  that  relate  the  five  frames  are  (0,0),  (1,0),  (0, 1),  (—1,0),  and  (0,  — 1). 
The  compressed  images  are  formed  by  quantization  of  four-level  DWT  decompositions  of 
the  images,  where  the  quantization  is  performed  as  described  by  (10).  The  frames  used  for 
compression  are  384  x  384  portions  of  the  original  larger  image,  which  allows  the  shifted 
frames  to  be  extracted  without  missing  pixels  at  the  image  borders.  Figure  13  shows  the 
original  test-pattern  image  along  with  the  image  to  be  restored,  the  (0,  0)-shifted  frame. 
Althongh  only  the  compressed  image  corresponding  to  (0,  0)  translation  is  shown  in  the 
figure,  the  other  four  compressed  images  appear  approximately  the  same;  all  five  compressed 
images  for  test-pattern  have  PSNR  in  the  range  32.84  ±  0.06  dB. 

Figure  14  compares  the  restoration  results  of  test-pattern  for  the  IID  Gaussian  noise  model 
and  for  the  proposed  noise  model.  As  can  be  seen  by  comparing  Figures  13  and  14,  even  the 
IID  Gaussian  noise  model  yields  considerable  improvement  over  the  originally-compressed 
image;  this,  however,  is  to  be  expected  due  to  the  contrived  nature  of  the  experiment.  The 
important  lesson  to  be  learned  from  Figure  14  is  not  the  improvement  relative  to  the  image 
in  Figure  13(b),  but  rather  the  improvement  of  the  images  in  Figure  14  (b)  and  (d)  relative 
to  those  in  parts  (a)  and  (c) — using  the  quantization  noise  model  proposed  in  this  technical 
report  provides  over  1  dB  of  improvement  in  PSNR  relative  to  using  an  IID  Gaussian  noise 
model,  and,  most  importantly,  there  is  a  significant  visible  improvement  as  well.  Table  4 
summarizes  results  for  other  test  images  using  all  five  quantization  noise  models  introduced 
at  the  beginning  of  this  subsection;  the  384  x  384  portions  of  each  of  these  test  images  are 
shown  in  Figure  15.  As  can  be  seen  from  Table  4,  using  the  full  quantization  error  covariance 
matrix  with  a  Gaussian  pdf  produces  PSNR  results  that  are  significantly  better  than  those 
produced  when  using  IID  distributions  or  when  using  reduced  versions  of  the  full  covariance 
matrix,  and  is  true  for  either  Laplace  or  Gaussian  distributions. 
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Fig.  13.  Original  and  compressed  images  for  the  first  temporal  filtering  experiment:  (a)  original;  (b)  compressed 
image  0,  PSNR=32.80  dB;  (c)  close-up  of  original;  (d)  close-up  of  compressed  image. 
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Fig.  14.  Comparison  of  restorations  of  the  compressed  image  in  Figure  13  using  IID  non-IID  Gaussian  noise  models, 
(a)  Restoration  using  IID  Gaussian  noise  model,  PSNR=35.10  dB;  (b)  restoration  using  Gaussian  noise  model  with 
covariance  matrices  ,  PSNR=36.14  dB;  (c)  close-up  of  IID  Gaussian  case;  (d)  close-up  of  non-IID  Gaussian  noise 
case. 
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TABLE  4 


PSNR  VALUES  FOR  RESTORATION  USING  VAROUS  QUANTIZATION  NOISE  MODELS.  THE  RANGE  OF  PSNR’S  UNDER 

“Compressed  images”  refers  to  the  PSNR’s  for  the  five  gompressed  input  images.  All  PSNR  values 

ARE  IN  DEGIBELS. 


Sequence 

name 

Compressed 

images 

Laplacian, 

IID 

Laplacian, 

diaglKg^) 

Gaussian, 

IID 

Gaussian, 

diag(Kg^) 

Gaussian, 

harh 

32.86 

0.02 

35.18 

35.50 

36.00 

36.31 

37.15 

boat 

30.49 

± 

0.03 

32.06 

32,24 

32.65 

32.84 

33.71 

mandrill 

32.29 

0.01 

34.79 

35.04 

35.52 

35.87 

36.83 

peppers 

32.50 

± 

0.03 

33.48 

33.55 

33.81 

33.90 

34.37 

test-pattern 

31.10 

0.06 

32.42 

32.61 

33.26 

33.42 

34.24 

test-pattern 

32.84 

± 

0.06 

34.30 

34.51 

35.10 

35.29 

36.14 

test-pattern 

34.73 

0.07 

36.39 

36.62 

37.10 

37.31 

38.09 

Fig.  15.  Original  input  images,  (a)  mandrill;  (b)  boat;  (c)  peppers;  (d)  barb. 


The  five  test  images  used  here  contain  various  characteristics,  from  the  text  contained  in 
the  resolution  chart  of  test-pattern,  to  the  texture  of  mandriirs  fur  or  barb’s  clothing,  to 
the  smooth  regions  contained  in  areas  of  each  of  the  images.  Although  the  results  presented 
in  Table  4  do  not  contain  an  exhaustive  comparison  between  images  at  different  compres¬ 
sion  qualities,  the  PSNR  improvements  evident  in  the  table  are  representative  of  PSNR 
improvements  that  have  been  observed  over  a  wide  range  of  compression  severities. 

5.2.  Experiment  2 

The  previous  subsection  demonstrated  the  potential  improvement  of  using  the  proposed 
quantization  noise  model  relative  to  using  simple  IID  Gaussian  or  Laplacian  noise  models. 
Both  the  proposed  noise  model  and  the  IID  noise  models  provided  significant  improvements 
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relative  to  the  compressed  images,  with  the  proposed  noise  model  giving  approximately  an 
additional  dB  of  improvement  over  the  IID  Gaussian  model.  The  experiment  presented  in 
this  subsection  will  demonstrate  that  the  improvements  that  one  gets,  for  each  of  the  noise 
models,  is  quite  dependent  on  the  actual  motion  between  the  frames. 

Consider  an  experiment  much  like  the  one  of  the  previous  subsection,  in  which  original 
frames  of  a  synthetic  video  sequence  are  formed  by  taking  384  x  384  subimages  from  an 
original  512  x  512  image;  each  of  the  frames  are  offset  by  an  integer  global  translation.  In 
this  subsection,  the  frame  to  be  restored  is  considered  as  having  global  translation  of  (0,  0), 
and  eight  other  frames  are  formed  by  taking  shifts  of  ±6  pixels,  i.e.,  (6,0),  (6,6),  (0,6), 
(—6,6),  (—6,0),  (—6,  —6),  (0,  —6),  and  (6,  —6).  Table  5  compares  PSNR  improvements  as  a 
function  of  6  for  Gaussian  noise  models  making  use  of  and  the  IID  noise  assumption; 
the  mandrill  image  is  used  for  testing.  For  this  simple  restoration  example,  the  PSNR 
improvements  obviously  depend  on  the  motion  that  occurs  between  frames — odd  pixel 
displacements  consistently  yield  better  results  than  even  displacements.  These  results  in 
the  table  are  quite  interesting;  for  each  value  of  6,  the  input  frames  all  have  (approximately) 
the  same  PSNR.  Thus,  although  in  each  case  of  6  the  input  images  are  shifted  versions  of 
each  other  with  equivalent  PSNR’s,  there  are  drastic  differences  in  PSNR  results  when  the 
samples  are  averaged.  Such  behavior  is  easily  explained:  Due  to  the  subsampling  employed 
by  a  non-expansive  discrete  wavelet  transform,  shifts  of  6  =  2  among  frames  result  in 
the  highest-frequency  subbands’  being  exact  shifts  of  each  other.  Since  the  quantization 
parameters  are  unchanged  between  images,  the  quantized  DWT  coefficients  in  these  highest- 
frequency  subbands  are  identical  among  the  different  images,  and  hence  no  new  information 
is  introduced  in  the  highest  subbands  of  the  shifted  image  observations.  Any  gains  for 
6  =  2  are  due  entirely  to  lower-frequency  subbands.  Similarly,  for  the  case  of  6  =  4  not 
only  are  the  highest-frequency  subbands  exact  shifts  of  each  other,  but  so  are  the  second- 
highest-frequency  subbands;  hence  the  PSNR  improvements  for  6  =  4  are  worse  than  for 
6  =  2.  The  trend  continues  for  6  =  8,  and  ultimately  one  can  conclude  that  for  a  lev- 
level  DWT  decomposition,  shifts  of  ±n2^®^,  v  an  integer,  there  will  be  zero  gain  by  hltering 
multiple  observations;  such  is  the  case  regardless  of  the  noise  model  one  uses.  An  analogous 
phenomenon  occurs  when  a  compression  technique  makes  use  of  the  block  DCT:  If  the  block 
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TABLE  5 


Dependence  of  filtering  results  on  integer  shifts  b  for  the  mandrill  image.  Here,  frame  0  is  being 

RESTORED  BASED  ON  THE  NINE  TOTAL  OBSERVATIONS  THAT  ARE  WITHIN  ±6  OF  A  SHIFT  OF  (0,  0) .  QUANTIZED 

INPUT  FRAMES  ALL  HAVE  PSNR  OF  34.1  ±  0.05  DB. 


h 

APSNR, 
Gaussian  IID 

APSNR, 

Gaussian 

1 

3.93 

5.69 

2 

0.80 

0.97 

3 

3.94 

5.71 

4 

0.25 

0.30 

5 

3.94 

5.71 

6 

0.80 

0.97 

7 

3.93 

5.69 

8 

0.14 

0.17 

size  is,  for  example,  8x8,  there  is  zero  gain  achieved  by  temporal  filtering  when  the  shifts 
among  the  images  are  integer  multiples  of  8. 

These  results  suggest  that  one  should  not  forget  that  the  quantization  noise  model  is 
merely  what  its  name  suggests — a  model.  True  quantization  error  is  not  a  random  process, 
but  rather  a  deterministic  and  repeatable  quantity;  compressing  the  same  image  at  two 
different  times  produces  quantization  errors  that  are  identical  for  the  two  images.  However, 
as  discussed  in  Section  2  the  quantization  noise  model  can  provide  a  foundation  upon  which 
restoration  algorithms  can  be  built,  and  although  the  drastic  results  of  Table  5  demonstrate  a 
limitation  of  the  model,  the  pathological  conditions  of  this  subsection’s  experiment  (namely, 
that  all  frames  differ  in  global  translation  by  a  constant  integer  shift)  rarely  occur  in  natural 
video.  Subsection  5.4  presents  results  for  actual  video  that  demonstrate  the  quantization 
noise  model’s  utility  in  temporal  filtering. 

5.3.  Experiment  3 

Here,  we  repeat  the  experiment  of  Subsection  5.1  by  replacing  the  two-dimensional  DWT 
by  the  three-dimensional  DWT.  Such  a  situation  was  discussed  in  Section  3,  where  it  was 
demonstrated  that  pixel  errors’  statistical  behavior  varies  both  spatially  and  temporally. 
Here,  three  noise  models  will  be  compared,  all  of  which  are  Gaussian:  Using  the  full 
using  an  HD  model;  and  using  independent  noise  with  variances  that  are  constant  within 
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TABLE  6 


Restoration  results  eor  compression  that  uses  a  three-dimensional  DWT.  The  PSNR  listed  under 
“Quantized”  is  the  highest  oe  the  PSNR’s  for  the  16  input  images,  and  corresponds  to  the  PSNR  at 

IMAGE  8  (compare  WITH  IMAGE  8  OF  FIGURE  2).  ThE  NOISE  MODEL  “GAUSSIAN  IND”  IS  AN  INDEPENDENT 

Gaussian  noise  model  that  assumes  each  received  frame  has  a  constant  error  variance  as  predicted 

IN  Figure  2. 


Sequence 

name 

1  PSNR,  dB 

Quantized 

Gaussian  IID 

Gaussian  Ind 

Gaussian 

barb 

36.27 

40.19 

40.29 

42.38 

boat 

34.01 

36.69 

36.79 

39.17 

mandrill 

31.30 

35.34 

35.37 

38.50 

peppers 

35.35 

36.77 

36.79 

37.90 

test-pattern 

36.24 

39.15 

39.21 

41.14 

a  frame,  but  vary  between  frames  in  a  manner  predicted  by  the  plot  in  Figure  2.  As  was 
the  case  in  Figure  2,  the  length  of  the  transform  in  the  temporal  direction  is  sixteen;  the 
image  sequence  is  synthesized  by  shifts  applied  to  a  prototype  image  of  (0,0),  (1,0),  (1, 1), 
and  continuing  in  an  outward  counter-clockwise  spiral  to  the  final  shift  of  (—1,2).  All  of  the 
sixteen  frames  in  this  group  of  pictures  are  filtered  to  produce  the  estimate  of  the  original 
image. 

Quantitative  results  for  the  experiment  are  shown  in  Table  6.  Significant  PSNR  improve¬ 
ments  are  evident  relative  to  the  received  noisy  images,  but  this  is  to  be  expected  since  the 
filtering  is  able  to  make  use  of  sixteen  noisy  versions  of  the  exact  same  original  frame;  in 
real-life  situations,  rarely  will  such  fortuitous  circumstances  arise.  The  important  conclusions 
to  be  drawn  from  the  results  are  not  in  regard  to  PSNR  improvement  relative  to  the  noisy 
images,  but  rather  the  PSNR  improvements  of  the  noise  models  relative  to  those  of  the  IID 
Gaussian  noise  model.  One  important  result  is  that  relative  to  the  IID  noise  model’s  results, 
there  is  practically  nothing  to  be  gained  in  temporal  filtering  by  making  use  of  the  frame-to- 
frame  error  variances  that  were  demonstrated  in  Figure  2;  it  seems  that  the  pixel- wise  noise 
characteristics  are  considerably  more  important  than  the  frame- wise  noise  characteristics. 
The  second  important  result  to  be  drawn  from  the  table  is  the  significant  gain  of  using  the 
full  theoretic  covariance  matrix  relative  to  the  other  noise  models. 
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5.4--  Experiment  with  Real  Video 


While  the  previous  three  subsections  dealt  with  synthetically- generated  video,  such  sit¬ 
uations  are  not  necessarily  indicative  of  the  results  one  would  obtain  with  real  video  se¬ 
quences.  This  subsection  considers  temporal  filtering  of  actual  motion  imagery  that  has 
been  compressed  using  the  2D  DWT.  We  will  consider  two  cases:  When  the  observation 
error  is  dominated  by  quantization  noise,  and  hence  the  motion-compensation  error  can 
be  neglected;  and  when  is  included  in  the  formulation.  For  each  of  the  two  cases,  two 
models  are  used  for  the  quantization  error — IID  Gaussian  noise  with  covariance  matrix  of 
(j^I,  and  Gaussian  noise  with  covariance 

In  previous  subsections,  the  motion  between  input  frames  was  known  exactly  because  of 
the  artificial  construction  of  the  sequences.  With  real  video,  the  motion  must  be  estimated. 
Here,  we  consider  video  sequences  that  contain  frames  that  differ  by  a  global  transformation, 
e.g.,  stationary  scenes  captured  by  a  moving  camera  that  is  located  at  a  far  distance  from  the 
actual  scene,  as  might  be  expected  from  aerial  surveillance  video.  Stationary  scenes  are  not 
a  requirement,  but  they  make  motion  estimation  simpler;  with  more  sophisticated  motion 
estimation,  the  algorithm  described  here  could  be  applied  equally  well.  To  register  frames  of 
the  input  imagery,  an  affine  motion  model  is  employed.  The  six  parameters  of  the  affine  modef 
that  refate  the  two  frames  are  estimated  iterativefy  within  a  coarse-to-fine  muftiresofution 
pyramid.  Note  that  white  previous  sections  assumed  integer  pixel  motions,  the  more  general 
model  here  allows  for  floating-point  pixel  motions;  ^  matrices  are  constructed  based  on 
bilinear  interpolation  for  these  fractional  pixel  motions. 

Figure  16  shows  five  frames  of  the  stiekers  sequence;  the  author  acquired  this  uncom¬ 
pressed  sequence  using  a  Pixelink  PL-A641  monochrome  camera.  While  the  sequence  is 
certainly  not  the  same  as  aerial  surveillance  video,  it  does  share  certain  qualities — a  large 
global- motion  component,  as  well  as  significant  detail  at  fine  resolutions.  The  writing  on  the 
stickers  will  serve  as  a  sort  of  resolution  chart  for  comparison  of  the  restoration  algorithms. 
For  this  example,  the  compressed  versions  of  these  five  frames  will  be  used  to  reconstruct 
image  2. 

Results  are  first  presented  for  two  compression  qualities  under  the  assumption  that  the 
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Fig.  16.  Original  640  x  480  input  images  of  the  stickers  sequence,  numbered  from  image  0  to  image  4. 


quantization  error  dominates  the  overall  error  n^’t  As  in  previous  subsections,  the  com¬ 
pression  is  simulated  by  scalar  quantization  of  the  coelRcients  from  a  four-level  DWT  de¬ 
composition.  Figure  17  presents  results  for  images  compressed  to  a  relatively  low  quality 
of  approximately  32.3  dB,  while  Figure  18  presents  results  for  images  compressed  at  higher 
quality  of  about  36.5  dB.  Each  figure  contains  the  compressed  middle  frame  of  the  five  input 
images,  along  with  restorations  using  both  the  full  quantization  noise  model  and  the  IID 
noise  model. 

In  Figure  17,  both  the  quantization  noise  model  and  the  IID  noise  model  show  considerable 
improvement  over  the  original  compressed  image,  which  shows  the  potential  advantages  of 
temporal  filtering  for  this  type  of  compressed  sequence.  The  visual  improvement  of  the 
quantization  noise  model  relative  to  the  IID  noise  model  is  evidenced  by  an  overall  sharper 
reconstruction,  with  more  apparent  contrast.  The  PSNR  is  also  slightly  higher  for  the 
restoration  using  the  quantization  noise  model. 

For  the  higher  quality  compressed  image  set  of  Figure  18,  visual  distinction  among  the 
images  is  not  as  clear.  There  are  some  improvements,  both  in  visual  quality  and  PSNR,  for 
the  restorations  compared  to  the  compressed  image;  for  example,  the  “California”  sticker,  or 
the  “Go  Bananas”  sticker.  Although  difficult  to  discern  on  the  printed  page,  the  restoration 
for  the  quantization  noise  model  is  slightly  sharper  than  that  of  the  IID  noise  model;  the  “Go 
Bananas”  sticker  is  arguably  more  legible  for  the  restoration  of  the  quantization  noise  model. 
Unfortunately,  the  additional  sharpness  for  this  higher-quality  case  comes  at  the  cost  of  a 
sharpening  of  noise  as  well,  which  accounts  for  the  slightly  lower  PSNR  for  the  restoration 
of  the  quantization  noise  model.  In  this  case,  noise  being  sharpened  is  not  compression 
noise  e^,  but  rather  the  motion  compensation  noise  Recall  that  the  restorations  of  this 
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Fig.  17.  Restoration  results  for  real  video  data.  The  five  input  images  have  PSNR  values  in  the  range  of  32.2  dB 
to  32.4  dB.  The  left  column  shows  the  full  640  x  480  images,  while  the  right  column  shows  zoomed  portions  of 
the  images.  (a,b):  compressed  image  2,  PSNR=32.23  dB;  (c,d):  restored  image  2  using  quantization  noise  model, 
PSNR=34.91  dB;  (e,f):  restored  image  2  using  IID  noise  model,  PSNR=34.58  dB. 
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section  thus  far  have  not  included  the  motion  compensation  noise,  but  rather  have  assumed 
that  the  quantization  noise  dominates.  Since  the  quantization  noise  covariance  matrix  is 
non-diagonal,  and  hence  includes  correlations  among  the  errors,  an  overall  sharpening  can 
result  from  the  filtering.  For  high-quality  compressed  motion  imagery,  the  quantization 
error  ceases  to  dominate  as  it  did  in  the  lower-quality  case,  and  the  motion-compensating 
error  becomes  more  influential.  As  a  result,  the  quantization  noise  model  sharpens  this 
motion-compensation  noise.  Such  sharpening  is  not  present  for  the  IID  case,  which  assumes 
that  the  noise  terms  at  each  pixel  observation  are  independent  and  identically  distributed. 
Thus  the  very  quality  that  prevents  the  IID  quantization  noise  model  from  providing  sharp 
restorations  also  prevents  it  from  enhancing  noise. 

To  better  account  for  the  motion  compensation  error,  one  can  explicitly  model  the  term 
Qk,i  j;Pat  the  overall  noise  term  would  be 

n^’^  =  e[  +  e^’K  (29) 

The  motion  compensation  noise  term  has  often  been  modeled  as  IID-Gaussian  dis¬ 
tributed  [22]  with  variance  terms  A^’^;  note  that  is  zero  for  k  =  1.  With  such  an 
assumption,  the  overall  noise  term  for  the  IID  quantization  noise  case  would  have  covariance 
matrix 

Kk,i  =  a^l  +  X'^’^I  (30) 

=  A^’^I,  (31) 

which  for  practical  purposes  is  nearly  equivalent  to  the  IID  quantization  noise  model  used 
previously;  this  explains  why  the  IID  quantization  noise  model  does  not  enhance  the  mo¬ 
tion  compensation  error.  For  the  non-IID  quantization  noise  model,  the  covariance  matrix 
becomes 

K,,;  =  H-'K,,  H-*  +  A'^-'I.  (32) 

The  restoration  algorithms  require  the  inversion  of  the  noise  covariance  matrix,  and  it  should 
be  readily  evident  that  the  inversion  for  the  diagonal  IID  case  is  much  easier  than  for  the 
non-diagonal  and  non-IID  case.  However,  since  only  the  product  of  the  matrix  inverse  with 
some  input  vector  is  needed,  and  not  the  actual  explicit  inverse  matrix,  iterative  methods 
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can  be  used  in  the  non-IID  case.  Appendix  D  discusses  the  iterative  implementation  for 
inversion  of  the  covariance  matrix  in  (32). 

Explicitly  modeling  the  motion  compensation  error  within  the  overall  noise  leads 
to  improvements  in  the  restorations  of  both  IID  and  non-IID  quantization  noise  models. 
Figure  19  shows  results  for  the  experiment  first  reported  in  Figure  18,  but  using  the  modified 
error  covariance  matrices  of  (30)  and  (32)  instead.  As  indicated  in  the  figure  caption, 
the  PSNR  performance  of  the  restoration  algorithm  that  used  the  full  quantization  noise 
covariance  matrix  has  risen  above  that  of  the  IID  quantization  noise  assumption;  in  both 
cases,  PSNR’s  are  higher  than  when  neglecting  the  term.  (Note  that  since  the  IID 
quantization  noise  assumption  already  modeled  fairly  well,  only  marginal  improvements 
result  for  the  IID  case.)  Although  PSNR  has  been  improved,  visual  differences  between  these 
images  and  their  counterparts  presented  in  Figure  18  are  difficult  to  discern.  Similarly,  the 
visual  improvement  of  Figure  19(b)  relative  to  Figure  19(d)  is  comparable  to  that  between 
Figure  18(d)  and  Figure  18(f). 

A  final  observation  about  temporal  filtering  concerns  the  algorithms’  sensitivity  to  errors 
in  motion  estimation.  Motion  compensation  error,  as  used  here,  refers  to  the  difference 
between  an  image  area  and  its  corresponding  area  in  a  reference  image.  Motion  estimation 
error  refers  to  inaccuracies  in  the  registration  of  the  areas  of  the  two  images — for  example, 
saying  that  an  object  has  moved  by  six  pixels  when  in  fact  it  has  only  moved  by  four.  For 
the  same  reasons  that  the  non-IID  quantization  noise  model  leads  to  noise  amplifications  for 
motion-compensation  error,  the  non-IID  model  also  leads  to  error  amplification  for  motion- 
estimation  errors.  Errors  in  motion  estimation  lead  to  large  errors  in  motion  compensation, 
which  when  combined  with  the  sharpening  features  of  the  quantization  noise  model  lead  to 
ringing-like  artifacts.  The  IID  quantization  noise  model  does  not  suffer  from  such  problems, 
because  it  only  averages  pixels  rather  than  sharpening  them;  motion-estimation  errors  for 
the  IID  case  simply  result  in  over-blurring,  and  sometimes  introducing  ghosting  artifacts, 
in  the  restoration.  Additionally,  when  using  block-based  translational  motion-estimation 
techniques  for  constructing  the  motion-compensating  matrices,  ringing  artifacts  can 
sometimes  result  near  the  block  boundaries  for  the  restorations  using  the  quantization  noise 
model.  Such  phenomena  do  not  necessarily  indicate  a  limitation  of  the  quantization  noise 
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Fig.  19.  Restoration  results  for  real  video  data  using  more  complex  error  covariance  matrices  of  (30)  and  (32). 
The  five  input  images  have  PSNR  values  in  the  range  of  36.4  dB  to  36.6  dB.  The  left  column  shows  the  full 
640  X  480  images,  while  the  right  column  shows  zoomed  portions  of  the  images.  Compressed  image  2  was  shown  in 
Figure  18(a,b),  with  PSNR=36.44  dB,  and  is  not  re-displayed  here.  (a,b):  Restored  image  2  using  quantization  noise 
model,  PSNR=38.22  dB;  (c,d):  restored  image  2  using  IID  noise  model,  PSNR=37.88  dB.  Here,  =  20|A:  —  /  |  + 10, 
and  cr^  =  24.1. 


model,  but  simply  reinforce  the  necessity  of  accurately  estimating  correspondences  between 
the  input  images.  Artifacts  can  be  avoided  by  accurately  estimating  and  assigning  the 
motion-compensation  error  covariance  terms^  However,  as  motion  estimation  accuracy 


^In  addition,  this  term  can  be  modified  such  that  the  pixel  variances  vary  from  position  to  position  depending  on 
the  accuracy  of  the  motion  estimation. 
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decreases  (or  motion  compensation  error  increases),  the  by  necessity  become  large 
enough  to  dominate  the  overall  error  covariance  term  /.  When  dominate,  the  resulting 
restorations  are  nearly  identical  to  those  achieved  using  the  IID  model.  If  accurate  motion 
estimation  is  simply  not  possible,  then  temporal  filtering  with  the  IID  quantization  noise 
model  may  be  more  appropriate  due  to  its  robustness  to  errors  in  motion  estimation, 
in  addition  to  its  simpler  implementation.  However,  with  accurate  motion  estimation  the 
quantization  noise  model  provides  better  restoration  results. 
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6.  Conclusion 


Pixel  errors  introduced  by  quantization  of  the  image’s  DWT  coefficients  are  not  indepen¬ 
dent,  nor  are  they  identically  distributed.  Experimental  observations  and  theoretic  deriva¬ 
tions  presented  here  indicate  that  the  errors  at  different  pixel  locations  have  determinedly 
different  characteristics,  and  the  noise  covariance  matrix  developed  in  this  technical  report 
more  accurately  describes  the  quantization  errors  than  does  an  IID  noise  assumption.  As 
one  example  application,  this  report  has  presented  a  novel  deblurring  algorithm  that  makes 
use  of  the  quantization  noise  model  within  a  Bayesian  framework,  whose  superiority  to  the 
same  Bayesian  formulation  with  assumed  IID  noise  was  demonstrated.  A  second  example 
application  was  motion  imagery  restoration  by  temporal  filtering,  where  the  benefits  of  using 
the  proposed  noise  model  over  an  IID  noise  assumption  were  demonstrated.  Appropriate 
applications  of  this  second  example  include  motion  imagery  compressed  using  a  3D  DWT, 
as  well  as  individually-compressed  frames  of  video,  for  example.  Motion  JPEG  2000  [28].  The 
quantization  noise  model  introduced  here  can  find  use  in  many  situations  where  wavelet- 
compressed  imagery  must  be  processed  by  algorithms  that  are  sensitive  to  noise. 
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Appendix  A 

The  DWT  Matrix  and  its  Transpose 


Since  the  structure  of  H*  depends  on  H,  the  structure  of  the  DWT  matrix  H  is  first 
discussed.  Only  the  one-dimensional  case  is  presented;  the  two-  and  three-dimensional  ver¬ 
sions  of  H  can  be  constructed  by  separable  extensions  of  the  one-dimensional  case.  This 
appendix  assumes  the  Daubechies  9/7  wavelet  transform  [18],  although  the  case  for  other 
wavelet  filters  is  analogous.  The  low-  and  high-pass  analysis  filter  coefiicients  are  shown  in 
Table  A.l. 

For  the  even- length  signal  2;[n],n  =  0,...,A/’  —  l,a  single-level  wavelet  decomposition  can 
be  applied  to  yield  low-pass  and  high-pass  coefficients, 

y  =  Hz,  (33) 

where  the  DWT  matrix  H  is  constructed  as 

(34) 

where  and  are  low-  and  high-pass  matrix  operators  of  size  ^  x  N.  The  low-pass 
coefficients  are  taken  from  the  first  y  elements  of  y,  while  the  high-pass  coefiicients  are  taken 
from  the  last  y  elements.  (For  odd  the  given  development  can  be  modified  accordingly.) 


H 


Hn 

ip 

Hn 

hp 


TABLE  A.l 


Low-  AND  HIGH-PASS  DWT  ANALYSIS  COEFFICIENTS. 


low-pass 

high-pass 

0.03782845550 

hs 

-0.02384946501 

93 

0.06453888262 

h2 

-0.11062440441 

92 

-0.04068941760 

hi 

0.37740285561 

9i 

-0.41809227322 

ho 

0.85269867900 

90 

0.78848561640 

hi 

0.37740285561 

9i 

-0.41809227322 

h2 

-0.11062440441 

92 

-0.04068941760 

hz 

h4 

-0.02384946501 

0.03782845550 

93 

0.06453888262 
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The  low-  and  high-pass  matrices  are  constructed  as 

Hq  2hi  2/i2  2/13  2/14  0  •  •  •  0 

h2  hi  +  /i3  ho  /i4  hi  /i2  hs  h^  0 
/i4  hs  h2  hi  ho  hi  /i2  h^  h^  0 

0  0  h4  hs  /i2  hi  ho  hi  h2  hs  h4  0 

0  h4  hs  h2  hi  ho  hi  /i2  hs  /i4  0 

0  h4  ho  h2  hi  ho  hi  /i2  +  h4  ho 

0  -  -  0  h4  ho  h2  +  h4  hi  +  ho  ho  +  h2  hi 

and 

51  50  -I-  52  51  -t  53  52  S3  0  ••  •  0 

S3  S2  Si  So  Si  S2  S3  0 

0  0  93  92  Si  go  91  92  93  0 

Hj;=  X  .  (36) 

•  •  •  0  S3  92  Si  So  Si  92  S3  0 

•  •  •  0  S3  92  Si  So  Si  +  S3  S2 

0  •  •  •  0  2gz  2^2  2gi  go 

The  rows  of  these  matrices  are  shifted  versions  of  a  simple  prototype  low-  or  high-pass 
filter  with  the  exception  of  near  the  signal  boundaries,  where  the  coefficients  arise  due  to 
symmetric  extension  of  the  input  signal.  Note  that  slightly  different  matrices  result  when 
the  signal  length  N  is  odd,  and  also  when  the  subsampling  that  corresponds  to  the  wavelet 
transform  is  different  than  even-lowpass — odd-highpass. 

Additional  levels  of  wavelet  decomposition  are  achieved  by  repetitive  filtering  on  resulting 
low-pass  subbands.  For  example,  a  two-level  wavelet  decomposition  has  a  corresponding 


(37) 


(38) 
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where  I-^  and  0-^  are  the  identity  and  zero  matrices  of  size  N.  One  could  continue  like  this 
indefinitely,  but  a  three-level  example  is  sufficient  for  our  needs  here.  When  written  in  the 
form  above,  the  transpose  of  the  DWT  operation  is  simple  to  perform;  for  the  three-level 
decomposition  of  (38),  it  is 


H 


Nt 

hp 


K'"‘  H 


0«/2 


N/2* 

hp 


i  0^/2 


,  F/2 


„iV/4*  jtN/4^ 
^Ip  ^hp 

1  qJV/4 

1 

1  0^/2 

1 

_L _ 

qN/4 

1  1^/4 

qN/2 

1  1^/2 

(39) 


Each  of  the  three  matrices  on  the  right-hand  side  of  (39)  consists  of  the  transposes  of  the 
low-  and  high-pass  matrices  of  (35)  and  (36).  Taken  individually,  the  components  of  H* 
start  at  the  lowest-resolution  subbands,  apply  the  individual  transpose  operation  at  that 
subband  level,  and  then  apply  the  transpose  operation  at  the  next  highest  subband  level, 
which  continues  until  all  subbands  have  been  processed.  One  only  needs  to  implement  the 
transposes  of  (35)  and  (36),  which  is  readily  achieved  due  to  the  matrices’  sparse  and  simple 
structure. 

It  should  be  noted  that  some  wavelet  transforms  are  orthogonal,  in  which  case  H*H  is 
diagonal;  with  proper  scaling,  such  DWT  matrices  become  orthonormal,  in  which  case  H*  is 
implemented  by  application  of  the  inverse  DWT.  Many  wavelet  transforms,  while  not  strictly 
orthogonal,  are  designed  to  be  nearly  orthogonal;  this  may  tempt  the  unwary  to  the  use  of 
the  inverse  DWT  in  place  of  the  actual  transpose.  Nevertheless,  use  of  the  true  transpose  is 
more  justified  theoretically,  yields  better  results,  and  since  it  can  be  implemented  efficiently 
there  is  little  computational  justification  for  choosing  the  inverse  over  the  transpose.  Use  of 
the  transpose  also  avoids  any  scaling  issues  associated  with  non-unity  gains  in  the  various 
subbands  of  the  wavelet  decomposition,  as  are  present  in  the  wavelet  transform  recommended 
by  the  JPEG  2000  standard  [3]. 
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Appendix  B 

Computation  of  Deblurring  Step  Size 


The  objective  function  being  minimized  is  given  by 

=  x(Bz  -  z<j)*K-/(Bz  -  Zg)  +  A^pT(d*z).  (40) 

^  cGC 

Given  a  direction  g,  the  optimal  step  size  is  found  as 

d  =  argminC(z  —  ag),  (41) 

a  one-dimensional  minimization  along  the  line  g.  Written  explicitly,  we  wish  to  minimize 
C>(z  -  ag)  =  ^(Bz  -Zq-  ttBg)*K“^(Bz  -  z,  -  «Bg)  +  A^PT(d*z  -  «d*g).  (42) 

Taking  the  partial  derivative  with  respect  to  a  and  setting  it  equal  to  zero  results  in  the 
equation  for  the  optimal  a, 


-g'B‘H‘K-‘H(Bz  -  z,  -  aBg)  g'TpWd'z  -  ad'g)  =  0,  (43) 

ceC 

where  Ke^  has  been  expanded  according  to  (3).  The  term  above  that  corresponds  to  the  prior 
probability  is  non-linear,  which  makes  an  exact  solution  difficult.  However,  by  neglecting  the 
higher-order  terms  of  a  Taylor  representation,  the  approximation 

P^dcZ  -  «d*g)  p^dcz)  -  «p^(d*z)d*g  (44) 

arises.  Substituting  (44)  into  (43), 

-g‘B‘H‘K;.'H(Bz  -  z,)  -  g‘A  y  d,p).(d‘z)  + 


ceC 

ag‘B‘H'K,-‘HBg  +  aAg‘  ^  depj(d‘z)d'g  =  0.  (45) 

cec 


which  simplifies  to 

-gWC(z)  +  ag*  B*H*K-;HB  +  a  y  d,p"  (d*z)dG  g  =  0,  (46) 

cec 

where  VO(z)  is  the  gradient  of  (40)  with  respect  to  z,  equivalent  to  the  equation  given 

in  (19).  The  final  estimate  for  the  optimal  step  size  is  then 

gVC(z) 


a  = 


b‘h<k.-;hb  +  a  y  d,pj(d)z)d) 

ceC 


(47) 


g 


Note  that  when  the  direction  g  is  chosen  as  the  gradient  of  the  objective  function,  then 
g  =  V(9(z),  and  the  numerator  of  the  above  equation  matches  that  given  earlier  in  (20). 
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Appendix  C 

Optimization  for  Temporal  Filtering 


To  optimize  (27),  the  method  of  conjugate  gradients  is  used.  For  iteration  z  +  1,  a  new 
estimate  for  is  obtained  from  the  estimate  at  iteration  i  as 

4+1)  =  zfp -«p)d(i),  (48) 

where  z^^^  is  the  estimate  at  iteration  i,  dp)  is  a  direction  vector  and  ap)  is  the  step  size 
that  controls  how  far  along  dp)  from  Zp)  the  new  estimate  is  taken. 

The  direction  is  computed  as 


dp)  =  gp)  +  /5p)dp-i), 


(49) 


where  /?p)  is 


gp)gp) 

P{i)  f 

g4i)g(i-i) 


and  g(j)  is  the  gradient  of  the  objective  function  in  (27),  computed  as 


k+n 


gp)=  E  + 


l=k—n 

In  the  simplifying  case  where  the  gradient  becomes 


k-\-n 

E 

l=k—ri 


g(i)=  E  aPh‘k;/h(Awz‘,  +  m'-’' -  <). 

to)  -9’ 


(50) 


(51) 


(52) 


The  initial  estimate  for  the  image  is  taken  as  and  the  initial  direction  is  taken  as 


d(o)  =  g(o)- 

The  optimal  step  size  «(*)  is  computed  in  a  manner  similar  to  that  of  the  deblurring  case, 
which  was  presented  in  Appendix  B.  The  result  of  a  comparable  derivation  is 

dp)gp) 


k+n 

4)  E  ApKpAt,d,i| 


(53) 


l=k—n 


which,  for  the  case  =  H  ,  becomes 

_ dp)gp) 


«p) 


k+n 


(54) 


djp  E  Al,H‘K;/HA,pd 


P) 


l=k—n 


For  more  details  on  the  conjugate  gradient  method  of  optimization,  consult  [29]. 
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Appendix  D 


Inversion  of  Full  Observation  Error  Covariance  Matrix 

The  inverse  of  the  full  observation  error  covariance  matrix  ^  is  needed  to  implement 
the  optimization  of  Appendix  C  when  7^  0, 

Kj-;=  +  (56) 

Due  to  the  combination  of  the  quantization  noise  covariance  matrix  and  the  motion  com¬ 
pensation  error  covariance  matrix,  the  above  inverse  cannot  be  implemented  through  simple 
operations  as  is  possible  when  only  one  of  the  two  noise  terms  is  present.  Since  the  inverse 
matrix  is  always  applied  to  an  input  vector  v  to  yield  an  output  vector  u,  it  suffices  to  solve 
only  for  the  vector  u  without  explicitly  evaluating  and  storing  the  full  inverse  matrix, 

u  =  V,  (56) 

which  is  equivalent  to  solving  the  following  equation  for  u: 

+  A'=’'l]  u  =  V.  (57) 

Equation  (57)  is  solved  here  using  the  conjugate  gradient  method.  For  notational  conve¬ 
nience,  replace  the  matrix  to  be  inverted  by  the  matrix  P,  such  that  (57)  becomes 

Pu  =  V.  (58) 

Solving  (58)  is  equivalent  to  minimizing 

^u*Pu  —  uV,  (59) 

whose  minimum  is  found  when  the  gradient  g  with  respect  to  u  is  zero. 


g  =  Pu  —  V  =  0.  (60) 

The  u  that  minimizes  (59)  is  computed  using  the  method  of  conjugate  gradients  in  the 
exact  same  manner  as  was  described  in  Appendix  C.  Equations  (49)  and  (50)  are  used  to 
determine  a  direction  dp),  which  when  used  with  a  step  size  of 

=  d*  PdcA  ^  ^ 

leads  to  an  updated  estimate  for  u,  analogous  to  the  update  equation  of  (48).  Note  that  in 

application  of  P,  the  transpose  of  the  inverse  DWT  must  be  evaluated,  i.e.,  Such  a 

function  is  implemented  similarly  to  the  case  for  H*,  which  was  discussed  in  Appendix  A. 
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Symbols,  Abbreviations,  and  Acronyms 


0 

qN 

^k,l 

B 

C 

C 

CCD 

DCT 

DWT 

Ey[u,v] 

GMRF 

H 

H* 

H-i 

Hn 

hp 

Hn 

ip 

HMRF 

I 

F 

IID 

Ke, 

Ke. 

Kei 

Ke^ 

Kk,l 

Ky 

Kz 

Ae,  [mi,  m2] 


zero  matrix  or  vector 

zero  matrix  of  size  N  x  N 

matrix  operator  for  predicting  frame  from 

blurring  matrix  operator 

parameter  of  multivariate  Laplace  distribution 

set  of  all  cliques  in  HMRF 

charge  coupled  device 

discrete  cosine  transform 

discrete  wavelet  transform 

(u,  vy^  element  of  DWT  error  represented  by  ey 

Gauss-Markov  random  field 

DWT  matrix 

transpose  of  matrix  H 

inverse  of  matrix  H 

high-pass  matrix  operator 

low-pass  matrix  operator 

Huber-Markov  random  field 

identity  matrix 

identity  matrix  of  size  N  x  N 

independent  and  identically  distributed 

quantization  error  covariance  matrix  for  Cy 

quantization  error  covariance  matrix  for 

quantization  error  covariance  matrix  for 

quantization  error  covariance  matrix  for 

error  covariance  matrix  for  observations  z^  given  z^ 

covariance  matrix  of  DWT  decomposition  y 

covariance  matrix  of  image  z 

(7711,7712)*^  element  of 
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^N,ri 

MAP 

ML 

MSE 

O(-) 

PSNR 

PSNRe 

PSNR^ 

T 

V 

Y[u,v] 
Z[m,  n] 

d 

d* 

diag(-) 

^k,l 

el 


e; 

g 

Kv 

K^vi'rn] 

l2 


constant  term  in  multivariate  Laplace  pdf 

maximum  a  posteriori 

maximum  likelihood 

mean-squared  error 

objective  function  being  minimized 

peak  signal  to  noise  ratio 

PSNR  at  convergence  in  deblurring  algorithm 

maximum  PSNR  achieved  by  deblurring  algorithm 

Huber  parameter 

normalization  term  in  HMRF 

(u,  element  of  DWT  decomposition  represented  by  y 

(m,  element  of  image  represented  by  z 

direction  vector  in  conjugate  gradient  optimization 

when  applied  to  z,  extracts  second-order  pixel  differences  for  HMRF 

vector  formed  by  taking  diagonal  of  the  matrix  argument 

error  in  predicting  7}  from  z^ 

quantization  error  for  y 

quantization  error  for  y^ 

quantization  error  for  z 

quantization  error  for  7} 

gradient  of  the  objective  function  being  minimized 
basis  image  of  (u,  DWT  coefficient 
mth  element  of 


error  variance  basis  images 

Ha 

acquisition  noise 

error  in  predicting  z^  given  z^ 

P(-) 

probability  distribution  function 

pdf 

probability  distribution  function 

y 

DWT  decomposition  of  z 

yq 

DWT  coefficients  after  quantization 
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z 

Zfe 

Z(i) 


z 


k 


#(a;) 

P 


lb 


V 

A 


Pt{-) 

Pri') 

Pt{') 

a^u,v] 


image  vector  formed  by  stacking  image  columns 
blurred  image 

parenthesized  subscript  denotes  quantities  at  iteration  i  of  an  iterative 
algorithm 

image  vector  at  frame  number  k 
image  after  quantization  of  its  DWT  coefficients 
quantization  step  size  for  DWT  subband 
nominal  quantizer 

quantization  step  size  for  DWT  coefficient 

characteristic  function 

step  size  for  iterative  optimizations 

parameter  in  conjugate  gradient  optimization 

sum  of  squared  image  errors  due  to  unit  error  in  DWT  subband  b 

parameter  of  multivariate  Laplace  distribution 

regularization  parameter  in  HMRF 

variance  for  IID  Gaussian  motion-compensation  noise 

term  to  account  for  unobservable  pixels  when  predicting  z^  from  z^ 

Huber  penalty  function,  parameterized  by  T 

first  derivative  of  pr(‘) 

second  derivative  of  pr(') 

variance  for  IID  Gaussian  quantization  noise 

quantization  error  variance  for  (u,  DWT  coefficient 
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